nLab
geometry of physics -- A first idea of quantum field theory

this entry is going to be one chapter of geometry of physics

under construction

These notes aim to give an expository but rigorous introduction to the basic concepts of relativistic perturbative quantum field theories, specifically those that arise as the perturbative quantization of a Lagrangian field theory, such as quantum electrodynamics, quantum chromodynamics, and perturbative quantum gravity appearing in the standard model of particle physics.

For a broad introduction to the idea of perturbative quantum field theory, see there.

Here, first we consider classical field theory (or rather pre-quantum field theory), complete with BV-BRST formalism; then its deformation quantization via causal perturbation theory to perturbative quantum field theory. This mathematically rigorous (i.e. clear and precise) formulation of the traditional informal lore has come to be known as perturbative algebraic quantum field theory.

We aim to give a fully local discussion, where all structures arise on the “jet bundle over the field bundle” (introduced below) and “transgress” from there to the spaces of field histories over spacetime (discussed further below). This “Higher Prequantum Geometry” streamlines traditional constructions and serves the conceptualization in the theory. This is joint work with Igor Khavkine.

In full beauty these concepts are extremely general and powerful; but the aim here is to give a first precise idea of the subject, not a fully general account. Therefore we concentrate on the special case where spacetime is Minkowski spacetime (def. 23 below), where the field bundle (def. 34 below) is an ordinary trivial vector bundle (example 9 below) and hence the Lagrangian density (def. 60 below) is globally defined. Similarly, when considering gauge theory we consider just the special case that the gauge parameter-bundle is a trivial vector bundle and we concentrate on the case that the gauge symmetries are “closed irreducible” (def. 23 below). But we aim to organize all concepts such that the structure of their generalization to curved spacetime and non-trivial field bundles is immediate.

This comparatively simple setup already subsumes what is considered in traditional texts on the subject; it captures the established perturbative BRST-BV quantization of gauge fields coupled to fermions on curved spacetimes – which is the state of the art. Further generalization, necessary for the discussion of global topological effects, such as instanton configurations of gauge fields, will be discussed elsewhere (see at homotopical algebraic quantum field theory).

Alongside the theory we develop the concrete examples of the real scalar field, the electromagnetic field and the Dirac field:

running examples

| field | field bundle | Lagrangian density | equation of motion |
|---|---|---|---|
| real scalar field | expl. 10 | expl. 39 | expl. 45 |
| Dirac field | expl. 35 | expl. 43 | expl. 52 |
| electromagnetic field | expl. 11 | expl. 40 | expl. 46 |
| Yang-Mills field | expl. 12, expl. 13 | expl. 41 | expl. 47 |
| B-field | expl. 14 | expl. 42 | expl. 48 |

| field | Poisson bracket | causal propagator | Hadamard propagator | Feynman propagator |
|---|---|---|---|---|
| real scalar field | expl. 73, expl. 76 | prop. 65 | def. 107 | def. 108 |
| Dirac field | expl. 73, expl. 49 | prop. 76 | def. 109 | def. 110 |

| field | gauge symmetry | local BRST complex | gauge fixing |
|---|---|---|---|
| electromagnetic field | expl. 92 | expl. 79 | expl. 106 |
| Yang-Mills field | expl. 93 | | |
| B-field | | | |

The electromagnetic field and the Dirac field combined are the fields of quantum electrodynamics, which we turn to at the end below.

Acknowledgement

These notes profited greatly from discussions with Igor Khavkine.

Thanks also to Marco Benini, Klaus Fredenhagen, Arnold Neumaier, Kasia Rejzner for helpful discussion.

A first idea of quantum field theory

Geometry

The geometry of physics is differential geometry. This is the flavor of geometry which is modeled on Cartesian spaces $\mathbb{R}^n$ with smooth functions between them. Here we briefly review the basics of differential geometry on Cartesian spaces.

In principle the only background assumed of the reader here is

  1. usual naive set theory (e.g. Lawvere-Rosebrugh 03);

  2. the concept of the continuum: the real line $\mathbb{R}$, the plane $\mathbb{R}^2$, etc.;

  3. the concepts of differentiation and integration of functions on such Cartesian spaces;

hence essentially the content of multi-variable differential calculus.

As we uncover Lagrangian field theory further below, we discover ever more general concepts of “space” in differential geometry, such as smooth manifolds, diffeological spaces, infinitesimal neighbourhoods, supermanifolds, Lie algebroids and super Lie ∞-algebroids. We introduce these incrementally as we go along:

more general spaces in differential geometry introduced further below

| differential geometry $\to$ higher differential geometry | infinitesimal geometry, Lie theory $\to$ higher Lie theory | needed in QFT for: |
|---|---|---|
| smooth manifolds (def. 44) | | spacetime (def. 23) |
| $\hookrightarrow$ diffeological spaces (def. 35) | | space of field histories (def. 16) |
| $\hookrightarrow$ smooth sets (def. 36) | | |
| $\hookrightarrow$ formal smooth sets (def. 40) | infinitesimally thickened points (def. 39) | Cauchy surface (def. 87), perturbation theory (def. 84) |
| $\hookrightarrow$ super formal smooth sets (def. 48) | superpoints (def. 46) | Dirac field (expl. 35), Pauli exclusion principle |
| $\hookrightarrow$ super formal smooth ∞-groupoids (not needed in fully perturbative QFT) | Lie ∞-algebroids (def. 114) | infinitesimal gauge symmetry/BRST complex (expl. 97) |

Abstract coordinate systems

What characterizes differential geometry is that it models geometry on the continuum, namely the real line $\mathbb{R}$, together with its Cartesian products $\mathbb{R}^n$, regarded with its canonical smooth structure (def. 1 below). We may think of these Cartesian spaces $\mathbb{R}^n$ as the “abstract coordinate systems” and of the smooth functions between them as the “abstract coordinate transformations”.

We will eventually consider below much more general “smooth spaces” $X$ than just the Cartesian spaces $\mathbb{R}^n$; but all of them are going to be understood by “laying out abstract coordinate systems” inside them, in the general sense of having smooth functions $f \colon \mathbb{R}^n \to X$ mapping a Cartesian space smoothly into them. All structure on generalized smooth spaces $X$ is thereby reduced to compatible systems of structures on just Cartesian spaces, one for each smooth “probe” $f\colon \mathbb{R}^n \to X$. This is called “functorial geometry”.

Notice that the popular concept of a smooth manifold (def./prop. 44 below) is essentially that of a smooth space which locally looks just like a Cartesian space, in that there exist sufficiently many $f \colon \mathbb{R}^n \to X$ which are (open) isomorphisms onto their images. Historically it was a long process to arrive at the insight that it is wrong to fix such local coordinate identifications $f$, or to have any structure depend on such a choice. But it is useful to go one step further:

In functorial geometry we do not even focus attention on those $f \colon \mathbb{R}^n \to X$ that are isomorphisms onto their image, but consider all “probes” of $X$ by “abstract coordinate systems”. This makes differential geometry both simpler and more powerful. The analogous insight for algebraic geometry is due to Grothendieck 65; it was transported to differential geometry by Lawvere 67.

This allows us to combine the best of two superficially disjoint worlds: On the one hand we may reduce all constructions and computations to coordinates, the way traditionally done in the physics literature; on the other hand we have full conceptual control over the coordinate-free generalized spaces analyzed thereby. What makes this work is that all coordinate-constructions are functorially considered over all abstract coordinate systems.

Definition

(Cartesian spaces and smooth functions between them)

For $n \in \mathbb{N}$ we say that the set $\mathbb{R}^n$ of $n$-tuples of real numbers is a Cartesian space. This comes with the canonical coordinate functions

$$ x^k \;\colon\; \mathbb{R}^n \longrightarrow \mathbb{R} $$

which send an $n$-tuple of real numbers to the $k$th element in the tuple, for $k \in \{1, \cdots, n\}$.

For

$$ f \;\colon\; \mathbb{R}^{n} \longrightarrow \mathbb{R}^{n'} $$

any function between Cartesian spaces, we may ask whether its partial derivative along the $k$th coordinate exists, denoted

$$ \frac{\partial f}{\partial x^k} \;\colon\; \mathbb{R}^{n} \longrightarrow \mathbb{R}^{n'} \,. $$

If this exists, we may in turn ask that the partial derivative of the partial derivative exists

$$ \frac{\partial^2 f}{\partial x^{k_1} \partial x^{k_2}} \coloneqq \frac{\partial}{\partial x^{k_2}} \frac{\partial f}{\partial x^{k_1}} $$

and so on.

A general higher partial derivative obtained this way is, if it exists, indexed by an $n$-tuple of natural numbers $\alpha \in \mathbb{N}^n$ and denoted

$$ \partial^\alpha f \;\coloneqq\; \frac{ \partial^{\vert \alpha \vert} f }{ \partial^{\alpha_1} x^1 \, \partial^{\alpha_2} x^2 \cdots \partial^{\alpha_n} x^n } \,, \tag{1} $$

where $\vert \alpha \vert \coloneqq \sum_{i = 1}^{n} \alpha_i$ is the total order of the partial derivative.

If all partial derivatives to all orders $\alpha \in \mathbb{N}^n$ of a function $f \colon \mathbb{R}^n \to \mathbb{R}^{n'}$ exist, then $f$ is called a smooth function.
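
As a small worked instance of the multi-index notation (1), consider the following computation. The particular function and multi-index are chosen here purely for illustration and do not appear elsewhere in these notes.

```latex
\begin{aligned}
f(x^1, x^2) &= (x^1)^3 (x^2)^2, \qquad \alpha = (2,1), \qquad \vert \alpha \vert = 2 + 1 = 3 \\
\partial^\alpha f
  &= \frac{\partial^{3} f}{\partial^{2} x^1 \, \partial^{1} x^2}
   = \frac{\partial}{\partial x^2} \left( \frac{\partial^2 f}{\partial x^1 \partial x^1} \right) \\
  &= \frac{\partial}{\partial x^2} \left( 6\, x^1 (x^2)^2 \right)
   = 12\, x^1 x^2
\end{aligned}
```

Since this $f$ is a polynomial, all its higher partial derivatives exist (they eventually vanish), so it is a smooth function in the above sense.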

Of course the composition $g \circ f$ of two smooth functions is again a smooth function.

$$ \array{ && \mathbb{R}^{n_2} \\ & {}^{\mathllap{f}}\nearrow && \searrow^{\mathrlap{g}} \\ \mathbb{R}^{n_1} && \underset{g \circ f}{\longrightarrow} && \mathbb{R}^{n_3} } \,. $$

The inclined reader may notice that this means that Cartesian spaces with smooth functions between them constitute a category (“CartSp”); but the reader not so inclined may ignore this.

For the following it is useful to think of each Cartesian space as an abstract coordinate system. We will be dealing with various generalized smooth spaces (see the table below), but they will all be characterized by a prescription for how to smoothly map abstract coordinate systems into them.

Example

(coordinate functions are smooth functions)

Given a Cartesian space $\mathbb{R}^n$, then all its coordinate functions (def. 1)

$$ x^k \;\colon\; \mathbb{R}^n \longrightarrow \mathbb{R} $$

are smooth functions (def. 1).

For

$$ f \colon \mathbb{R}^{n_1} \longrightarrow \mathbb{R}^{n_2} $$

any smooth function and $a \in \{1, 2, \cdots, n_2\}$ write

$$ f^a \coloneqq x^a \circ f \;\colon\; \mathbb{R}^{n_1} \overset{f}{\longrightarrow} \mathbb{R}^{n_2} \overset{x^a}{\longrightarrow} \mathbb{R} $$

for its composition with this coordinate function.

Example

(algebra of smooth functions on Cartesian spaces)

For each $n \in \mathbb{N}$, the set

$$ C^\infty(\mathbb{R}^n) \;\coloneqq\; Hom_{CartSp}(\mathbb{R}^n, \mathbb{R}) $$

of real number-valued smooth functions $f \colon \mathbb{R}^n \to \mathbb{R}$ on the $n$-dimensional Cartesian space (def. 1) becomes a commutative associative algebra over the ring of real numbers by pointwise addition and multiplication in $\mathbb{R}$: for $f,g \in C^\infty(\mathbb{R}^n)$ and $x \in \mathbb{R}^n$

  1. $(f + g)(x) \coloneqq f(x) + g(x)$

  2. $(f \cdot g)(x) \coloneqq f(x) \cdot g(x)$.

The inclusion

$$ \mathbb{R} \overset{const}{\hookrightarrow} C^\infty(\mathbb{R}^n) $$

is given by the constant functions.

We call this the real algebra of smooth functions on $\mathbb{R}^n$:

$$ C^\infty(\mathbb{R}^n) \;\in\; \mathbb{R} Alg \,. $$

If

$$ f \;\colon\; \mathbb{R}^{n_1} \longrightarrow \mathbb{R}^{n_2} $$

is any smooth function (def. 1) then pre-composition with $f$ (“pullback of functions”)

$$ \array{ C^\infty(\mathbb{R}^{n_2}) &\overset{f^\ast}{\longrightarrow}& C^\infty(\mathbb{R}^{n_1}) \\ g &\mapsto& f^\ast g \coloneqq g \circ f } $$

is an algebra homomorphism. Moreover, this is clearly compatible with composition in that

$$ f_1^\ast(f_2^\ast g) = (f_2 \circ f_1)^\ast g \,. $$

Stated more abstractly, this means that assigning algebras of smooth functions is a functor

$$ C^\infty(-) \;\colon\; CartSp \longrightarrow \mathbb{R} Alg^{op} $$

from the category CartSp of Cartesian spaces and smooth functions between them (def. 1), to the opposite of the category $\mathbb{R}$Alg of $\mathbb{R}$-algebras.
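
The pullback of functions and its compatibility with sums, products and composition can be spot-checked numerically. The following is only a minimal sketch: functions are modelled as plain Python callables, and the maps `f1`, `f2`, `g`, `h` as well as the helper names `pullback`, `compose`, `add`, `mul`, `check` are hypothetical choices made for this illustration.

```python
import math

def pullback(f):
    """Return f^*, which sends a function g to the composite g o f."""
    return lambda g: (lambda *x: g(*f(*x)))

def compose(f_outer, f_inner):
    """Return f_outer o f_inner for maps that return coordinate tuples."""
    return lambda *x: f_outer(*f_inner(*x))

def add(g, h):
    """Pointwise sum of two real-valued functions."""
    return lambda *x: g(*x) + h(*x)

def mul(g, h):
    """Pointwise product of two real-valued functions."""
    return lambda *x: g(*x) * h(*x)

# Illustrative smooth maps, chosen only for this sketch.
f1 = lambda t: (math.cos(t), math.sin(t))   # R^1 -> R^2
f2 = lambda x, y: (x + y, x * y, x - y)     # R^2 -> R^3
g = lambda u, v, w: u * v + w               # a smooth function on R^3
h = lambda u, v, w: math.exp(u) - w         # another smooth function on R^3

def check(point=(0.3, -1.2), t=0.7):
    """Check the homomorphism and composition properties at sample points."""
    f2_star = pullback(f2)
    sums_agree = math.isclose(f2_star(add(g, h))(*point),
                              f2_star(g)(*point) + f2_star(h)(*point))
    products_agree = math.isclose(f2_star(mul(g, h))(*point),
                                  f2_star(g)(*point) * f2_star(h)(*point))
    composition_agrees = math.isclose(pullback(f1)(f2_star(g))(t),
                                      pullback(compose(f2, f1))(g)(t))
    return sums_agree, products_agree, composition_agrees
```

Calling `check()` evaluates both sides of each identity at one sample point; this illustrates the statements but of course does not prove them.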

Definition

(local diffeomorphisms and open embeddings of Cartesian spaces)

A smooth function $f \colon \mathbb{R}^{n} \to \mathbb{R}^{n}$ from one Cartesian space to itself (def. 1) is called a local diffeomorphism, denoted

$$ f \;\colon\; \mathbb{R}^{n} \overset{et}{\longrightarrow} \mathbb{R}^n $$

if the determinant of the matrix of partial derivatives (the “Jacobian” of $f$) is everywhere non-vanishing

$$ det \left( \array{ \frac{\partial f^1}{\partial x^1}(x) &\cdots& \frac{\partial f^n}{\partial x^1}(x) \\ \vdots && \vdots \\ \frac{\partial f^1}{\partial x^n}(x) &\cdots& \frac{\partial f^n}{\partial x^n}(x) } \right) \;\neq\; 0 \phantom{AAAA} \text{for all} \, x \in \mathbb{R}^n \,. $$

If the function $f$ is both a local diffeomorphism, as above, as well as an injective function then we call it an open embedding, denoted

$$ f \;\colon\; \mathbb{R}^n \overset{\phantom{A}et\phantom{A}}{\hookrightarrow} \mathbb{R}^n \,. $$

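
To see that injectivity is a genuinely additional condition, here is a sketch of a standard example: the complex exponential written in real coordinates. The map `f`, the hand-computed `jacobian_det` and the helper `same_image` are assumptions introduced for this illustration only.

```python
import math

def f(x, y):
    """Illustrative map R^2 -> R^2: (x, y) |-> (e^x cos y, e^x sin y)."""
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def jacobian_det(x, y):
    """Determinant of the matrix of partial derivatives of f, entered by hand.

    df1/dx =  e^x cos y,   df2/dx = e^x sin y,
    df1/dy = -e^x sin y,   df2/dy = e^x cos y,
    so the determinant is e^(2x) (cos^2 y + sin^2 y) = e^(2x), never zero.
    """
    ex = math.exp(x)
    d1dx, d2dx = ex * math.cos(y), ex * math.sin(y)
    d1dy, d2dy = -ex * math.sin(y), ex * math.cos(y)
    return d1dx * d2dy - d2dx * d1dy

def same_image(p, q, tol=1e-12):
    """True if f sends the points p and q to (numerically) the same point."""
    return all(math.isclose(a, b, abs_tol=tol) for a, b in zip(f(*p), f(*q)))
```

Here `jacobian_det` is positive at every point, so `f` is a local diffeomorphism, while `same_image((0.0, 0.0), (0.0, 2 * math.pi))` holds, so `f` is not injective and hence not an open embedding.
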
Definition

(good open cover of Cartesian spaces)

For $\mathbb{R}^n$ a Cartesian space (def. 1), a differentiably good open cover is

  • an indexed set

    $$ \left\{ \mathbb{R}^n \underoverset{et}{\phantom{AA}f_i\phantom{AA}}{\hookrightarrow} \mathbb{R}^n \right\}_{i \in I} $$

    of open embeddings (def. 2)

such that the images

$$ U_i \coloneqq im(f_i) \subset \mathbb{R}^n $$

satisfy:

  1. (open cover) every point of $\mathbb{R}^n$ is contained in at least one of the $U_i$;

  2. (good) all finite intersections $U_{i_1} \cap \cdots \cap U_{i_k} \subset \mathbb{R}^n$ are either the empty set or themselves images of open embeddings according to def. 2.

The inclined reader may notice that the concept of differentiably good open covers from def. 3 is a coverage on the category CartSp of Cartesian spaces with smooth functions between them, making it a site, but the reader not so inclined may ignore this.

(Fiorenza-Schreiber-Stasheff 12, def. 6.3.9)

fiber bundles

Given any context of objects and morphisms between them, such as the Cartesian spaces and smooth functions from def. 1, it is of interest to fix one object $X$ and consider other objects parameterized over it. These are called bundles (def. 4 below). For reference, we briefly discuss here the basic concepts related to bundles in the context of Cartesian spaces.

Of course the theory of bundles is mostly trivial over Cartesian spaces; it gains its main interest from its generalization to more general smooth manifolds (def./prop. 44 below). It is still worthwhile for our development to first consider the relevant concepts in this simple case.

For more exposition see at fiber bundles in physics.

Definition

(bundles)

We say that a smooth function $E \overset{fb}{\to} X$ (def. 1) is a bundle just to amplify that we think of it as exhibiting $E$ as being a “space over $X$”:

$$ \array{ E \\ \downarrow\mathrlap{fb} \\ X } \,. $$

For $x \in X$ a point, we say that the fiber of this bundle over $x$ is the pre-image

$$ E_x \coloneqq fb^{-1}(\{x\}) \subset E \tag{2} $$

of the point $x$ under the smooth function. We think of $fb$ as exhibiting a “smoothly varying” set of fiber spaces over $X$.

Given two bundles $E_1 \overset{fb_1}{\to} X$ and $E_2 \overset{fb_2}{\to} X$ over $X$, a homomorphism of bundles between them is a smooth function $f \colon E_1 \to E_2$ (def. 1) between their total spaces which respects the bundle projections, in that

$$ fb_2 \circ f = fb_1 \phantom{AAAA} \text{i.e.} \phantom{AAA} \array{ E_1 && \overset{f}{\longrightarrow} && E_2 \\ & {}_{\mathllap{fb_1}}\searrow && \swarrow_{\mathrlap{fb_2}} \\ && X } \,. $$

Hence a bundle homomorphism is a smooth function that sends fibers to fibers over the same point:

$$ f\left( (E_1)_x \right) \;\subset\; (E_2)_x \,. $$

The inclined reader may notice that this defines a category of bundles over $X$, which is in fact just the slice category $CartSp_{/X}$; the reader not so inclined may ignore this.

Definition

(sections)

Given a bundle $E \overset{fb}{\to} X$ (def. 4) a section is a smooth function $s \colon X \to E$ such that

$$ fb \circ s = id_X \phantom{AAAAA} \array{ && E \\ & {}^{\mathllap{s}}\nearrow & \downarrow\mathrlap{fb} \\ X &=& X } \,. $$

This means that $s$ sends every point $x \in X$ to an element in the fiber over that point

$$ s(x) \in E_x \,. $$

We write

$$ \Gamma_X(E) \coloneqq \left\{ \array{ && E \\ & {}^{\mathllap{s}}\nearrow & \downarrow^\mathrlap{fb} \\ X &=& X } \phantom{fb} \right\} $$

for the set of sections of a bundle.

For $E_1 \overset{fb_1}{\to} X$ and $E_2 \overset{fb_2}{\to} X$ two bundles and for

$$ \array{ E_1 && \overset{f}{\longrightarrow} && E_2 \\ & {}_{\mathllap{fb_1}}\searrow && \swarrow_{\mathrlap{fb_2}} \\ && X } $$

a bundle homomorphism between them (def. 4), composition with $f$ sends sections to sections and hence yields a function denoted

$$ \array{ \Gamma_X(E_1) &\overset{f_\ast}{\longrightarrow}& \Gamma_X(E_2) \\ s &\mapsto& f \circ s } \,. $$

Example

(trivial bundle)

For $X$ and $F$ Cartesian spaces, then the Cartesian product $X \times F$ equipped with the projection

$$ \array{ X \times F \\ \downarrow^\mathrlap{pr_1} \\ X } $$

to $X$ is a bundle (def. 4), called the trivial bundle with fiber $F$. This represents the constant smoothly varying set of fibers, constant on $F$.

If $F = \ast$ is the point, then this is the identity bundle

$$ \array{ X \\ \downarrow\mathrlap{id} \\ X } \,. $$

Given any bundle $E \overset{fb}{\to} X$, then a bundle homomorphism (def. 4) from the identity bundle to $E \overset{fb}{\to} X$ is equivalently a section of $E \overset{fb}{\to} X$ (def. 5)

$$ \array{ X && \overset{s}{\longrightarrow} && E \\ & {}_{\mathllap{id}}\searrow && \swarrow_{\mathrlap{fb}} \\ && X } $$

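
The notions of trivial bundle, section and bundle homomorphism can be made concrete in a few lines. This is a minimal sketch under simplifying assumptions: the base and the fiber are both taken to be the real line, points of the total space are modelled as pairs, and the particular section `s` and homomorphism `f` are hypothetical examples.

```python
import math

# Base space X = R and fiber F = R (illustrative choices); the total space
# E = X x F of the trivial bundle is modelled by pairs (x, v).

def pr1(e):
    """Bundle projection X x F -> X of the trivial bundle."""
    x, _v = e
    return x

def s(x):
    """A section: sends x to a point in the fiber over x."""
    return (x, math.sin(x))

def f(e):
    """A bundle homomorphism E -> E: changes the fiber coordinate only."""
    x, v = e
    return (x, 2.0 * v + x)

def pushforward(f, s):
    """f_*: s |-> f o s, which sends sections to sections."""
    return lambda x: f(s(x))

def is_section_at(s, x):
    """Check the defining condition pr1 o s = id at the point x."""
    return pr1(s(x)) == x
```

For instance `is_section_at(pushforward(f, s), 0.4)` holds, because `f` leaves the base coordinate untouched, which is exactly the condition that it respects the bundle projections.
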
Definition

(fiber bundle)

A bundle $E \overset{fb}{\to} X$ (def. 4) is called a fiber bundle with typical fiber $F$ if there exists a differentiably good open cover $\{U_i \hookrightarrow X\}_{i \in I}$ (def. 3) such that the restriction of $fb$ to each $U_i$ is isomorphic to the trivial fiber bundle with fiber $F$ over $U_i$. Such diffeomorphisms $f_i \colon U_i \times F \overset{\simeq}{\to} E\vert_{U_i}$ are called local trivializations of the fiber bundle:

$$ \array{ U_i \times F &\underoverset{\simeq}{f_i}{\longrightarrow}& E\vert_{U_i} \\ & {}_{\mathllap{pr_1}}\searrow & \downarrow\mathrlap{fb\vert_{U_i}} \\ && U_i } \,. $$

Definition

(vector bundle)

A vector bundle is a fiber bundle $E \overset{vb}{\to} X$ (def. 6) with typical fiber a vector space $V$ such that there exists a local trivialization $\{U_i \times V \underoverset{\simeq}{f_i}{\to} E\vert_{U_i}\}_{i \in I}$ whose gluing functions

$$ U_i \cap U_j \times V \overset{f_i\vert_{U_i \cap U_j}}{\longrightarrow} E\vert_{U_i \cap U_j} \overset{f_j^{-1}\vert_{U_i \cap U_j}}{\longrightarrow} U_i \cap U_j \times V $$

for all $i,j \in I$ are linear functions over each point $x \in U_i \cap U_j$.

A homomorphism of vector bundles is a bundle morphism $f$ (def. 4) such that there exist local trivializations on both sides with respect to which $f$ is fiber-wise a linear map.

The inclined reader may notice that this makes vector bundles over $X$ a category (denoted $Vect_{/X}$); the reader not so inclined may ignore this.

Example

(module of sections of a vector bundle)

Given a vector bundle $E \overset{vb}{\to} X$ (def. 7), then its set of sections $\Gamma_X(E)$ (def. 4) becomes a real vector space by fiber-wise multiplication with real numbers. Moreover, it becomes a module over the algebra of smooth functions $C^\infty(X)$ (example 2) by the same fiber-wise multiplication:

$$ \array{ C^\infty(X) \otimes_{\mathbb{R}} \Gamma_X(E) &\longrightarrow& \Gamma_X(E) \\ (f,s) &\mapsto& (x \mapsto f(x) \cdot s(x)) } \,. $$

For $E_1 \overset{fb_1}{\to} X$ and $E_2 \overset{fb_2}{\to} X$ two vector bundles and

$$ \array{ E_1 && \overset{f}{\longrightarrow} && E_2 \\ & {}_{\mathllap{fb_1}}\searrow && \swarrow_{\mathrlap{fb_2}} \\ && X } $$

a vector bundle homomorphism (def. 7) then the induced function on sections (def. 5)

$$ f_\ast \;\colon\; \Gamma_X(E_1) \longrightarrow \Gamma_X(E_2) $$

is compatible with this action by smooth functions and hence constitutes a homomorphism of $C^\infty(X)$-modules.

The inclined reader may notice that this means that taking spaces of sections yields a functor

$$ \Gamma_X(-) \;\colon\; Vect_{/X} \longrightarrow C^\infty(X) Mod $$

from the category of vector bundles over $X$ to that of modules over $C^\infty(X)$.

Example

(tangent vector fields and tangent bundle)

For $\mathbb{R}^n$ a Cartesian space (def. 1) the trivial vector bundle (example 3, def. 7)

$$ \array{ T \mathbb{R}^n &\coloneqq& \mathbb{R}^n \times \mathbb{R}^n \\ \mathllap{tb}\downarrow && \downarrow\mathrlap{pr_1} \\ \mathbb{R}^n &=& \mathbb{R}^n } $$

is called the tangent bundle of $\mathbb{R}^n$. With $(x^a)_{a = 1}^n$ the coordinate functions on $\mathbb{R}^n$ (def. 1) we write $(\partial_a)_{a = 1}^n$ for the corresponding linear basis of $\mathbb{R}^n$ regarded as a vector space. Then a general section (def. 5)

$$ \array{ && T \mathbb{R}^n \\ & {}^{\mathllap{v}}\nearrow& \downarrow\mathrlap{tb} \\ \mathbb{R}^n &=& \mathbb{R}^n } $$

of the tangent bundle has a unique expansion of the form

$$ v = v^a \partial_a $$

where a sum over indices is understood (Einstein summation convention) and where the components $(v^a \in C^\infty(\mathbb{R}^n))_{a = 1}^n$ are smooth functions on $\mathbb{R}^n$ (def. 1).

Such a $v$ is also called a smooth tangent vector field on $\mathbb{R}^n$.

Each tangent vector field $v$ on $\mathbb{R}^n$ determines a partial derivative on smooth functions

$$ \array{ C^\infty(\mathbb{R}^n) &\overset{D_v}{\longrightarrow}& C^\infty(\mathbb{R}^n) \\ f &\mapsto& \mathrlap{ D_v f \coloneqq v^a \partial_a (f) \coloneqq \sum_a v^a \frac{\partial f}{\partial x^a} } } \,. $$

By the product law of differentiation, this is a derivation on the algebra of smooth functions (example 2) in that

  1. it is an $\mathbb{R}$-linear map in that

    $$ D_v( c_1 f_1 + c_2 f_2 ) = c_1 D_v f_1 + c_2 D_v f_2 $$

  2. it satisfies the Leibniz rule

    $$ D_v(f_1 \cdot f_2) = (D_v f_1) \cdot f_2 + f_1 \cdot (D_v f_2) $$

for all $c_1, c_2 \in \mathbb{R}$ and all $f_1, f_2 \in C^\infty(\mathbb{R}^n)$.

Hence regarding tangent vector fields as partial derivatives constitutes a linear function

$$ D \;\colon\; \Gamma_{\mathbb{R}^n}(T \mathbb{R}^n) \longrightarrow Der(C^\infty(\mathbb{R}^n)) $$

from the space of sections of the tangent bundle. In fact this is a homomorphism of $C^\infty(\mathbb{R}^n)$-modules (example 4), in that for $f \in C^\infty(\mathbb{R}^n)$ and $v \in \Gamma_{\mathbb{R}^n}(T \mathbb{R}^n)$ we have

$$ D_{f v}(-) = f \cdot D_v(-) \,. $$

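
The derivation property of $D_v$ can be checked on an example. The sketch below evaluates $D_v f$ exactly for polynomial functions by computing with numbers of the form $a + b\,\epsilon$ subject to $\epsilon^2 = 0$; this device is a choice made for the illustration, not something taken from the text above. The class `Dual`, the helpers `lift` and `D`, the vector field `v` and the functions `f1`, `f2` are all hypothetical names. Only addition and multiplication are implemented, so the sketch applies to polynomial expressions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number a + b*eps with eps*eps = 0 (b is the first-order part)."""
    re: float
    eps: float = 0.0

    def __add__(self, other):
        other = lift(other)
        return Dual(self.re + other.re, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps, because eps^2 = 0
        return Dual(self.re * other.re,
                    self.re * other.eps + self.eps * other.re)

    __rmul__ = __mul__


def lift(c):
    """Regard an ordinary number as a Dual with vanishing eps-part."""
    return c if isinstance(c, Dual) else Dual(float(c))


def D(v, f):
    """Return the function D_v f = v^a (df/dx^a) for a polynomial f."""
    def D_v_f(*x):
        # Evaluate f at x + eps * v(x) and read off the coefficient of eps.
        shifted = [Dual(xa, va) for xa, va in zip(x, v(*x))]
        return lift(f(*shifted)).eps
    return D_v_f


# Illustrative data: a rotation vector field on R^2 and two polynomials.
v = lambda x, y: (y, -x)
f1 = lambda x, y: x * x * y
f2 = lambda x, y: x + 3 * y
product = lambda x, y: f1(x, y) * f2(x, y)

p = (1.0, 2.0)
lhs = D(v, product)(*p)
rhs = D(v, f1)(*p) * f2(*p) + f1(*p) * D(v, f2)(*p)
print(lhs, rhs)  # both sides of the Leibniz rule at p: 47.0 47.0
```

At the sample point one finds $D_v f_1 = 2 x y^2 - x^3 = 7$ and $D_v f_2 = y - 3 x = -1$, so both sides of the Leibniz rule evaluate to $7 \cdot 7 + 2 \cdot (-1) = 47$.
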
Example

(vertical tangent bundle)

Let $E \overset{fb}{\to} \Sigma$ be a fiber bundle. Then its vertical tangent bundle $T_\Sigma E \overset{T fb}{\to} \Sigma$ is the fiber bundle (def. 6) over $\Sigma$ whose fiber over a point is the tangent bundle (def. 5) of the fiber of $E \overset{fb}{\to}\Sigma$ over that point:

$$ (T_\Sigma E)_x \coloneqq T(E_x) \,. $$

If $E \simeq \Sigma \times F$ is a trivial fiber bundle with fiber $F$, then its vertical tangent bundle is the trivial fiber bundle with fiber $T F$.

Definition

(dual vector bundle)

For $E \overset{vb}{\to} \Sigma$ a vector bundle (def. 7), its dual vector bundle is the vector bundle whose fiber (2) over $x \in \Sigma$ is the dual vector space of the corresponding fiber of $E \to \Sigma$:

$$ (E^\ast)_x \;\coloneqq\; (E_x)^\ast \,. $$

The defining pairing of dual vector spaces $(E_x)^\ast \otimes E_x \to \mathbb{R}$ applied pointwise induces a pairing on the modules of sections (def. 4) of the original vector bundle and its dual with values in the smooth functions (def. 1):

$$ \array{ \Gamma_\Sigma(E) \otimes_{C^\infty(\Sigma)} \Gamma_\Sigma(E^\ast) &\longrightarrow& C^\infty(\Sigma) \\ (v,\alpha) &\mapsto& (v \cdot \alpha \colon x \mapsto \alpha_x(v_x) ) } \tag{3} $$

synthetic differential geometry

Below we encounter generalizations of ordinary differential geometry that include explicit “infinitesimals” in the guise of infinitesimally thickened points, as well as “super-graded infinitesimals”, in the guise of superpoints (necessary for the description of fermion fields such as the Dirac field). As we discuss below, these structures are naturally incorporated into differential geometry in just the same way as Grothendieck introduced them into algebraic geometry (in the guise of “formal schemes”), namely in terms of formally dual rings of functions with nilpotent ideals. That this also works well for differential geometry rests on the following three basic but important properties, which say that smooth functions behave “more algebraically” than their definition might superficially suggest:

Proposition

(the three magic algebraic properties of differential geometry)

  1. embedding of Cartesian spaces into formal duals of R-algebras

    For $X$ and $Y$ two Cartesian spaces, the smooth functions $f \colon X \longrightarrow Y$ between them (def. 1) are in natural bijection with their induced algebra homomorphisms $C^\infty(Y) \overset{f^\ast}{\longrightarrow} C^\infty(X)$ (example 2), so that one may equivalently handle Cartesian spaces entirely via their $\mathbb{R}$-algebras of smooth functions.

    Stated more abstractly, this means equivalently that the functor $C^\infty(-)$ that sends a smooth manifold $X$ to its $\mathbb{R}$-algebra $C^\infty(X)$ of smooth functions (example 2) is a fully faithful functor:

    $$ C^\infty(-) \;\colon\; SmthMfd \overset{\phantom{AAAA}}{\hookrightarrow} \mathbb{R} Alg^{op} \,. $$

    (Kolar-Slovak-Michor 93, lemma 35.8, corollaries 35.9, 35.10)

  2. embedding of smooth vector bundles into formal duals of R-algebra modules

    For $E_1 \overset{vb_1}{\to} X$ and $E_2 \overset{vb_2}{\to} X$ two vector bundles (def. 7) there is then a natural bijection between vector bundle homomorphisms $f \colon E_1 \to E_2$ and the homomorphisms of modules $f_\ast \;\colon\; \Gamma_X(E_1) \to \Gamma_X(E_2)$ that these induce between the spaces of sections (example 4).

    More abstractly this means that the functor $\Gamma_X(-)$ is a fully faithful functor

    $$ \Gamma_X(-) \;\colon\; VectBund_X \overset{\phantom{AAAA}}{\hookrightarrow} C^\infty(X) Mod $$

    (Nestruev 03, theorem 11.29)

    Moreover, the modules over the $\mathbb{R}$-algebra $C^\infty(X)$ of smooth functions on $X$ which arise this way as sections of smooth vector bundles over a Cartesian space $X$ are precisely the finitely generated free modules over $C^\infty(X)$.

    (Nestruev 03, theorem 11.32)

  3. vector fields are derivations of smooth functions.

    For $X$ a Cartesian space (example 1), then any derivation $D \colon C^\infty(X) \to C^\infty(X)$ on the $\mathbb{R}$-algebra $C^\infty(X)$ of smooth functions (example 2) is given by differentiation with respect to a uniquely defined smooth tangent vector field: The function that regards tangent vector fields as derivations from example 5

    $$ \array{ \Gamma_X(T X) &\overset{\phantom{A}\simeq\phantom{A}}{\longrightarrow}& Der(C^\infty(X)) \\ v &\mapsto& D_v } $$

    is in fact an isomorphism.

    (This follows directly from the Hadamard lemma.)
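
The step from the Hadamard lemma to the third statement can be sketched as follows. This is the standard argument, written out under the assumption that $D$ is a derivation of $C^\infty(\mathbb{R}^n)$ and that $x_0 \in \mathbb{R}^n$ is an arbitrary point; the symbols $x_0$ and $g_a$ are introduced only for this sketch, with the $g_a$ being the smooth functions that the Hadamard lemma provides.

```latex
\begin{aligned}
f(x) &= f(x_0) + (x^a - x_0^a)\, g_a(x) \,,
  \qquad g_a \in C^\infty(\mathbb{R}^n) \,,
  \quad g_a(x_0) = \frac{\partial f}{\partial x^a}(x_0)
  \\
D(c) &= 0 \quad \text{for constant } c \,,
  \qquad \text{since } D(1) = D(1 \cdot 1) = 2 D(1)
  \\
(D f)(x_0)
  &= \big( D(x^a) \cdot g_a \big)(x_0)
   + \big( (x^a - x_0^a) \cdot D(g_a) \big)(x_0)
  \\
  &= D(x^a)(x_0) \cdot \frac{\partial f}{\partial x^a}(x_0)
\end{aligned}
```

Since $x_0$ was arbitrary, this says that $D = D_v$ for the vector field with components $v^a \coloneqq D(x^a)$, and these components are uniquely determined because $D_v(x^a) = v^a$.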

Actually all three statements in prop. 1 hold not just for Cartesian spaces, but generally for smooth manifolds (def./prop. 44 below), if only we generalize in the second statement from free modules to projective modules. However, for our development here it is useful to first focus on just Cartesian spaces and then bootstrap the theory of smooth manifolds and much more from that, which we do below.

differential forms

We introduce and discuss differential forms on Cartesian spaces.

Definition

(differential 1-forms on Cartesian spaces and the cotangent bundle)

For $n \in \mathbb{N}$ a smooth differential 1-form $\omega$ on a Cartesian space $\mathbb{R}^n$ (def. 1) is an $n$-tuple

$$ \left(\omega_i \in CartSp\left(\mathbb{R}^n,\mathbb{R}\right)\right)_{i = 1}^n $$

of smooth functions (def. 1), which we think of equivalently as the coefficients of a formal linear combination

$$ \omega = \omega_i d x^i $$

on a set $\{d x^1, d x^2, \cdots, d x^n\}$ of cardinality $n$.

Here a sum over repeated indices is tacitly understood (Einstein summation convention).

Write

$$ \Omega^1(\mathbb{R}^k) \simeq CartSp(\mathbb{R}^k, \mathbb{R})^{\times k}\in Set $$

for the set of smooth differential 1-forms on $\mathbb{R}^k$.

We may think of the expressions $(d x^a)_{a = 1}^n$ as a linear basis for the dual vector space $(\mathbb{R}^n)^\ast$. With this the differential 1-forms are equivalently the sections (def. 5) of the trivial vector bundle (example 3, def. 7)

$$ \array{ T^\ast \mathbb{R}^n &\coloneqq& \mathbb{R}^n \times (\mathbb{R}^n)^\ast \\ \mathllap{cb}\downarrow && \downarrow\mathrlap{pr_1} \\ \mathbb{R}^n &=& \mathbb{R}^n } $$

called the cotangent bundle of $\mathbb{R}^n$ (def. 9):

$$ \Omega^1(\mathbb{R}^n) = \Gamma_{\mathbb{R}^n}(T^\ast \mathbb{R}^n) \,. $$

This amplifies via example 4 that $\Omega^1(\mathbb{R}^n)$ has the structure of a module over the algebra of smooth functions $C^\infty(\mathbb{R}^n)$, by the evident multiplication of differential 1-forms with smooth functions:

  1. The set $\Omega^1(\mathbb{R}^k)$ of differential 1-forms in a Cartesian space (def. 9) is naturally an abelian group with addition given by componentwise addition

    $$ \begin{aligned} \omega + \lambda & = \omega_i d x^i + \lambda_i d x^i \\ & = (\omega_i + \lambda_i) d x^i \end{aligned} \,, $$

  2. The abelian group $\Omega^1(\mathbb{R}^k)$ is naturally equipped with the structure of a module over the algebra of smooth functions $C^\infty(\mathbb{R}^k)$ (example 2), where the action $C^\infty(\mathbb{R}^k) \times\Omega^1(\mathbb{R}^k) \to \Omega^1(\mathbb{R}^k)$ is given by componentwise multiplication

    $$ f \cdot \omega = ( f \cdot \omega_i) d x^i \,. $$

Accordingly there is a canonical pairing between differential 1-forms and tangent vector fields (example 5)

(4)Γ n(T n) Γ n(T* n) ι ()() C ( n) (v,ω) ι vωv aω a \array{ \Gamma_{\mathbb{R}^n}(T \mathbb{R}^n) \otimes_{\mathbb{R}} \Gamma_{\mathbb{R}^n}(T \ast \mathbb{R}^n) &\overset{\iota_{(-)}(-) }{\longrightarrow}& C^\infty(\mathbb{R}^n) \\ (v,\omega) &\mapsto& \mathrlap{ \iota_v \omega \coloneqq v^a \omega_a } }

With differential 1-forms in hand, we may collect all the first-order partial derivatives of a smooth function into a single object: the exterior derivative or de Rham differential is the $\mathbb{R}$-linear function

$$ \array{ C^\infty(\mathbb{R}^n) &\overset{d}{\longrightarrow}& \Omega^1(\mathbb{R}^n) \\ f &\mapsto& \mathrlap{ d f \coloneqq \frac{\partial f}{ \partial x^a} d x^a } } \,. \tag{5} $$

Under the above pairing with tangent vector fields $v$ this yields the particular partial derivative along $v$:

$$ \iota_v d f = D_v f = v^a \frac{\partial f}{\partial x^a} \,. $$

We think of $d x^i$ as a measure for infinitesimal displacements along the $x^i$-coordinate of a Cartesian space. If we have a measure of infinitesimal displacement on some $\mathbb{R}^n$ and a smooth function $f \colon \mathbb{R}^{\tilde n} \to \mathbb{R}^n$, then this induces a measure for infinitesimal displacement on $\mathbb{R}^{\tilde n}$: first send a displacement there along $f$ to $\mathbb{R}^n$, then apply the given measure there. This is captured by the following definition:

Definition

(pullback of differential 1-forms)

For $\phi \colon \mathbb{R}^{\tilde k} \to \mathbb{R}^k$ a smooth function, the pullback of differential 1-forms along $\phi$ is the function

$$ \phi^* \colon \Omega^1(\mathbb{R}^{k}) \to \Omega^1(\mathbb{R}^{\tilde k}) $$

between sets of differential 1-forms, def. 9, which is defined on basis-elements by

$$ \phi^* d x^i \;\coloneqq\; \frac{\partial \phi^i}{\partial \tilde x^j} d \tilde x^j $$

and then extended linearly, with the coefficient functions pulled back by precomposition with $\phi$:

$$ \begin{aligned} \phi^* \omega & = \phi^* \left( \omega_i d x^i \right) \\ & \coloneqq \left(\phi^* \omega_i\right) \frac{\partial \phi^i }{\partial \tilde x^j} d \tilde x^j \\ & = (\omega_i \circ \phi) \cdot \frac{\partial \phi^i }{\partial \tilde x^j} d \tilde x^j \end{aligned} \,. $$

This is compatible with identity morphisms and composition in that

$$ (id_{\mathbb{R}^n})^\ast = id_{\Omega^1(\mathbb{R}^n)} \phantom{AAAA} (g \circ f)^\ast = f^\ast \circ g^\ast \,. \tag{6} $$

Stated more abstractly, this just means that pullback of differential 1-forms makes the assignment of sets of differential 1-forms to Cartesian spaces a contravariant functor

$$ \Omega^1(-) \;\colon\; CartSp^{op} \longrightarrow Set \,. $$
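As a rough numerical illustration of the pullback formula above, the following sketch evaluates $\phi^\ast \omega$ pointwise from a hand-coded Jacobian. The names `pullback_1form`, `polar`, `polar_jacobian` and `rotation_form` are hypothetical helpers chosen for this example, not notation from the text. The example map is the polar-coordinate map $\phi(r,\theta) = (r \cos\theta, r \sin\theta)$ and the 1-form is $\omega = -y\, d x + x\, d y$, whose pullback should be $r^2 d\theta$.

```python
import math

def pullback_1form(omega, phi, jacobian, x_tilde):
    """Components of phi^* omega at the point x_tilde.

    omega(x)              -> list of components omega_i at x
    phi(x_tilde)          -> the image point x = phi(x_tilde)
    jacobian(x_tilde)[i][j] = d phi^i / d x~^j
    """
    w = omega(phi(x_tilde))      # omega_i composed with phi
    jac = jacobian(x_tilde)
    n_tilde = len(jac[0])
    # (phi^* omega)_j = (omega_i o phi) * d phi^i / d x~^j, summed over i
    return [sum(w[i] * jac[i][j] for i in range(len(w))) for j in range(n_tilde)]

def polar(p):
    r, theta = p
    return [r * math.cos(theta), r * math.sin(theta)]

def polar_jacobian(p):
    r, theta = p
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def rotation_form(x):
    # omega = -y dx + x dy
    return [-x[1], x[0]]

# Components with respect to (dr, dtheta) at r = 2, theta = 0.3
components = pullback_1form(rotation_form, polar, polar_jacobian, [2.0, 0.3])
```

Here `components` is the pair of coefficients of $d r$ and $d\theta$; for this choice of $\omega$ they come out as approximately $(0, r^2)$, independently of $\theta$.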

The following definition captures the idea that if $d x^i$ is a measure for displacement along the $x^i$-coordinate, and $d x^j$ a measure for displacement along the $x^j$-coordinate, then there should be a way to get a measure, to be called $d x^i \wedge d x^j$, for infinitesimal surfaces (squares) in the $x^i$-$x^j$-plane. And this should keep track of the orientation of these squares, with

$$ d x^j \wedge d x^i = - d x^i \wedge d x^j $$

being the same infinitesimal measure with orientation reversed.

Definition

(exterior algebra of differential n-forms)

For $k \in \mathbb{N}$, the smooth differential forms on a Cartesian space $\mathbb{R}^k$ (def. 1) form the exterior algebra

$$ \Omega^\bullet(\mathbb{R}^k) \coloneqq \wedge^\bullet_{C^\infty(\mathbb{R}^k)} \Omega^1(\mathbb{R}^k) $$

over the algebra of smooth functions $C^\infty(\mathbb{R}^k)$ (example 2) of the module $\Omega^1(\mathbb{R}^k)$ of smooth 1-forms.

For $n \in \mathbb{N}$ we write $\Omega^n(\mathbb{R}^k)$ for the sub-module of degree $n$ and call its elements the differential n-forms.

Explicitly this means that a differential n-form $\omega \in \Omega^n(\mathbb{R}^k)$ on $\mathbb{R}^k$ is a formal linear combination over $C^\infty(\mathbb{R}^k)$ (example 2) of basis elements of the form $d x^{i_1} \wedge \cdots \wedge d x^{i_n}$ for $i_1 \lt i_2 \lt \cdots \lt i_n$:

$$ \omega = \omega_{i_1, \cdots, i_n} d x^{i_1} \wedge \cdots \wedge d x^{i_n} \,. $$

Now all the constructions for differential 1-forms above extend naturally to differential n-forms:

Definition

(exterior derivative or de Rham differential)

For $\mathbb{R}^n$ a Cartesian space (def. 1) the de Rham differential $d \colon C^\infty(\mathbb{R}^n) \to \Omega^1(\mathbb{R}^n)$ (5) extends uniquely as a derivation of degree +1 to the exterior algebra of differential forms (def. 11)

$$ d \;\colon\; \Omega^\bullet(\mathbb{R}^n) \longrightarrow \Omega^\bullet(\mathbb{R}^n) $$

meaning that for $\omega_i \in \Omega^{k_i}(\mathbb{R}^n)$ it satisfies the graded Leibniz rule

$$ d(\omega_1 \wedge \omega_2) \;=\; (d \omega_1) \wedge \omega_2 + (-1)^{k_1} \omega_1 \wedge d \omega_2 \,. $$

In components this simply means that

$$ \begin{aligned} d \omega & = d \left(\omega_{i_1 \cdots i_k} d x^{i_1} \wedge \cdots \wedge d x^{i_k}\right) \\ & = \frac{\partial \omega_{i_1 \cdots i_k}}{\partial x^{a}} d x^a \wedge d x^{i_1} \wedge \cdots \wedge d x^{i_k} \end{aligned} \,. $$

Since partial derivatives commute with each other, while differential 1-forms anti-commute, this implies that $d$ is nilpotent

$$ d^2 = d \circ d = 0 \,. $$

We say hence that differential forms form a cochain complex, the de Rham complex $(\Omega^\bullet(\mathbb{R}^n), d)$.

Definition

(contraction of differential n-forms with tangent vector fields)

The pairing $\iota_v \omega = \omega(v)$ of tangent vector fields $v$ with differential 1-forms $\omega$ (4) uniquely extends to the exterior algebra $\Omega^\bullet(\mathbb{R}^n)$ of differential forms (def. 11) as a derivation of degree -1

$$ \iota_v \;\colon\; \Omega^{\bullet+1}(\mathbb{R}^n) \longrightarrow \Omega^\bullet(\mathbb{R}^n) \,. $$

In particular for $\omega_1, \omega_2 \in \Omega^1(\mathbb{R}^n)$ two differential 1-forms, then

$$ \iota_{v} (\omega_1 \wedge \omega_2) \;=\; \omega_1(v) \omega_2 - \omega_2(v) \omega_1 \;\in\; \Omega^1(\mathbb{R}^n) \,. $$
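The contraction formula for a wedge product of two 1-forms can be checked at a single point with plain linear algebra. The sketch below is only an illustration under a stated convention: it stores the 2-form $\omega_1 \wedge \omega_2$ as the antisymmetric matrix $B_{a b} = (\omega_1)_a (\omega_2)_b - (\omega_2)_a (\omega_1)_b$, so that $\iota_v$ acts by $v^a B_{a b}$. The function names `pair`, `wedge` and `contract` are hypothetical.

```python
def pair(v, omega):
    """Canonical pairing iota_v omega = v^a omega_a of a vector with a covector."""
    return sum(a * b for a, b in zip(v, omega))

def wedge(w1, w2):
    """Antisymmetric component matrix B_ab of the 2-form w1 ^ w2."""
    n = len(w1)
    return [[w1[a] * w2[b] - w2[a] * w1[b] for b in range(n)] for a in range(n)]

def contract(v, B):
    """Contraction of v into the first slot: (iota_v B)_b = v^a B_ab."""
    n = len(v)
    return [sum(v[a] * B[a][b] for a in range(n)) for b in range(n)]

v = [1, 2, 3]
w1 = [1, 0, 2]
w2 = [0, 1, -1]

lhs = contract(v, wedge(w1, w2))
# Right-hand side of the formula: w1(v) w2 - w2(v) w1
rhs = [pair(v, w1) * b - pair(v, w2) * a for a, b in zip(w1, w2)]
```

With integer inputs the two sides agree exactly, which is the derivation property of $\iota_v$ in degree 2.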
Proposition

(pullback of differential n-forms)

For $f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ a smooth function between Cartesian spaces (def. 1) the operation of pullback of differential 1-forms of def. 9 extends as an algebra homomorphism, compatible with the pullback of smooth functions along $f$, to the exterior algebra of differential forms (def. 11),

$$ f^\ast \;\colon\; \Omega^\bullet(\mathbb{R}^{n_2}) \longrightarrow \Omega^\bullet(\mathbb{R}^{n_1}) $$

given on basis elements by

$$ f^* \left( dx^{i_1} \wedge \cdots \wedge dx^{i_n} \right) = \left(f^* dx^{i_1} \wedge \cdots \wedge f^* dx^{i_n} \right) \,. $$

This commutes with the de Rham differential $d$ on both sides (def. 12) in that

$$ d \circ f^\ast = f^\ast \circ d \phantom{AAAAA} \array{ \Omega^\bullet(\mathbb{R}^{n_1}) &\overset{f^\ast}{\longleftarrow}& \Omega^\bullet(\mathbb{R}^{n_2}) \\ \mathllap{d}\downarrow && \downarrow\mathrlap{d} \\ \Omega^\bullet(\mathbb{R}^{n_1}) &\underset{f^\ast}{\longleftarrow}& \Omega^\bullet(\mathbb{R}^{n_2}) } $$

hence that pullback of differential forms is a chain map of de Rham complexes.

This is still compatible with identity morphisms and composition in that

$$ (id_{\mathbb{R}^n})^\ast = id_{\Omega^\bullet(\mathbb{R}^n)} \phantom{AAAA} (g \circ f)^\ast = f^\ast \circ g^\ast \,. \tag{7} $$

Stated more abstractly, this just means that pullback of differential n-forms makes the assignment of sets of differential n-forms to Cartesian spaces a contravariant functor

$$ \Omega^n(-) \;\colon\; CartSp^{op} \longrightarrow Set \,. $$
Proposition

(Cartan's homotopy formula)

Let $X$ be a Cartesian space (def. 1), and let $v \in \Gamma(T X)$ be a smooth tangent vector field (example 5).

For $t \in \mathbb{R}$ write $\exp(t v) \colon X \overset{\simeq}{\to} X$ for the flow by diffeomorphisms along $v$ of parameter length $t$.

Then the derivative with respect to $t$ of the pullback of differential forms along $\exp(t v)$, hence the Lie derivative $\mathcal{L}_v \colon \Omega^\bullet(X) \to \Omega^\bullet(X)$, is given by the anticommutator of the contraction derivation $\iota_v$ (def. 13) with the de Rham differential $d$ (def. 12):

$$ \begin{aligned} \mathcal{L}_v \omega &\coloneqq \frac{d}{d t } \exp(t v)^\ast \omega \vert_{t = 0} \\ & = \iota_v d \omega + d \iota_v \omega \,. \end{aligned} $$

Finally we turn to the concept of integration of differential forms (def. 15 below). First we need to say what it is that differential forms may be integrated over:

Definition

(smooth singular simplicial chains in Cartesian spaces)

For $n \in \mathbb{N}$, the standard n-simplex in the Cartesian space $\mathbb{R}^n$ (def. 1) is the subset

$$ \Delta^n \;\coloneqq\; \left\{ (x^i)_{i = 1}^n \;\vert\; 0 \leq x^1 \leq \cdots \leq x^n \leq 1 \right\} \;\subset\; \mathbb{R}^n \,. $$

More generally, a smooth singular n-simplex in a Cartesian space $\mathbb{R}^k$ is a smooth function (def. 1)

$$ \sigma \;\colon\; \mathbb{R}^n \longrightarrow \mathbb{R}^k \,, $$

to be thought of as a smooth extension of its restriction

$$ \sigma\vert_{\Delta^n} \;\colon\; \Delta^n \longrightarrow \mathbb{R}^k \,. $$

(This is called a singular simplex because there is no condition that $\sigma$ be an embedding in any way, in particular $\sigma$ may be a constant function.)

A singular chain in $\mathbb{R}^k$ of dimension $n$ is a formal linear combination of singular $n$-simplices in $\mathbb{R}^k$.

In particular, given a singular $(n+1)$-simplex $\sigma$, its boundary $\partial \sigma$ is a singular chain of singular $n$-simplices.

Definition

(fiber-integration of differential forms over smooth singular chains in Cartesian spaces)

For $n \in \mathbb{N}$ and $\omega \in \Omega^n(\mathbb{R}^n)$ a differential n-form (def. 11), which may be written as

$$ \omega = f d x^1 \wedge \cdots \wedge d x^n \,, $$

its integration over the standard n-simplex $\Delta^n \subset \mathbb{R}^n$ (def. 14) is the ordinary integral (e.g. Riemann integral)

$$ \int_{\Delta^n} \omega \;\coloneqq\; \underset{0 \leq x^1 \leq \cdots \leq x^n \leq 1}{\int} f(x^1, \cdots, x^n) \, d x^1 \cdots d x^n \,. $$

More generally, for

  1. $\omega \in \Omega^n(\mathbb{R}^k)$ a differential n-form;

  2. $C = \underset{i}{\sum} c_i \sigma_i$ a singular $n$-chain (def. 14)

in any Cartesian space $\mathbb{R}^k$, the integration of $\omega$ over $C$ is the sum of the integrations, as above, of the pullback of differential forms (def. 2) along all the singular n-simplices in the chain:

$$ \int_C \omega \;\coloneqq\; \underset{i}{\sum} c_i \int_{\Delta^n} (\sigma_i)^\ast \omega \,. $$

Finally, for $U$ another Cartesian space, fiber integration of differential forms along $U \times C \to U$ is the linear map

$$ \int_C \;\colon\; \Omega^{\bullet + dim(C)}(U \times C) \longrightarrow \Omega^\bullet(U) $$

which on differential forms of the form $\omega_U \wedge \omega$ is given by

$$ \int_C \omega_U \wedge \omega \;\coloneqq\; (-1)^{\vert \omega_U\vert} \omega_U \int_C \omega \,. $$
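To make the integral over the standard simplex in the definition above concrete, here is a minimal numerical sketch for $n = 2$, where $\Delta^2 = \{0 \leq x^1 \leq x^2 \leq 1\}$. It uses a plain midpoint Riemann sum, counting grid cells on the diagonal with weight one half; the function name `integrate_over_simplex_2d` and the step count are assumptions made for this illustration only.

```python
def integrate_over_simplex_2d(f, steps=200):
    """Midpoint Riemann sum of f(x1, x2) dx1 dx2 over 0 <= x1 <= x2 <= 1."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x1 = (i + 0.5) * h
        for j in range(i, steps):          # only cells with x2 >= x1
            x2 = (j + 0.5) * h
            # cells cut by the diagonal x1 = x2 lie half inside the simplex
            weight = 0.5 if i == j else 1.0
            total += weight * f(x1, x2) * h * h
    return total

# omega = 1 dx1 ^ dx2 gives the volume of the 2-simplex, 1/2! = 1/2
volume = integrate_over_simplex_2d(lambda x1, x2: 1.0)

# omega = (x1 + x2) dx1 ^ dx2; by symmetry this is half the integral
# of x1 + x2 over the unit square, hence also 1/2
linear = integrate_over_simplex_2d(lambda x1, x2: x1 + x2)
```

Integrating over a general singular simplex $\sigma$ would first pull $\omega$ back along $\sigma$ and then apply the same sum to the resulting coefficient function.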
Proposition

(Stokes theorem for fiber-integration of differential forms)

For $\Sigma$ a smooth singular simplicial chain (def. 15) the operation of fiber-integration of differential forms along $U \times \Sigma \overset{pr_1}{\longrightarrow} U$ (def. 15) is compatible with the exterior derivative $d_U$ on $U$ (def. 12) in that

$$ \begin{aligned} d \int_\Sigma \omega & = (-1)^{dim(\Sigma)} \int_\Sigma d_U \omega \\ & = (-1)^{dim(\Sigma)} \left( \int_\Sigma d \omega - \int_{\partial \Sigma} \omega \right) \end{aligned} \,, $$

where $d = d_U + d_\Sigma$ is the de Rham differential on $U \times \Sigma$ (def. 12) and where the second equality is the Stokes theorem along $\Sigma$:

$$ \int_\Sigma d_\Sigma \omega = \int_{\partial \Sigma} \omega \,. $$

\,

This concludes our review of the basics of (synthetic) differential geometry on which the following development of quantum field theory is based. In the next chapter we consider spacetime and spin.

\,

Spacetime

Relativistic field theory takes place on spacetime.

The concept of spacetime makes sense for every dimension $p+1$ with $p \in \mathbb{N}$. The observable universe has macroscopic dimension $3+1$, but quantum field theory generally makes sense also in lower and in higher dimensions. For instance quantum field theory in dimension $0+1$ is the “worldline” theory of particles, also known as quantum mechanics; while quantum field theory in dimension $\gt p+1$ may be “KK-compactified” to an “effective” field theory in dimension $p+1$ which generally looks more complicated than its higher dimensional incarnation.

However, every realistic field theory, and also most of the non-realistic field theories of interest, contain spinor fields such as the Dirac field (example 43 below) and the precise nature and behaviour of spinors does depend sensitively on spacetime dimension. In fact the theory of relativistic spinors is mathematically most natural in just the following four spacetime dimensions:

$$ p + 1 \;=\; 2+1,\;\; 3+1,\;\; 5+1,\;\; 9+1 \,. $$

In the literature one finds these four dimensions advertized for two superficially unrelated reasons:

  1. in precisely these dimensions “twistors” exist (see there);

  2. in precisely these dimensions “GS-superstrings” exist (see there).

However, both these explanations have a common origin in something simpler and deeper: Spacetime in these dimensions appears from the “Pauli matrices” with entries in the real normed division algebras. (In fact it goes deeper still, but this will not concern us here.)

This we explain now, and then we use this to obtain a slick handle on spinors in these dimensions, using simple linear algebra over the four real normed division algebras. At the end (in remark 7 below) we give a dictionary that expresses these constructions in terms of the “two-component spinor notation” that is traditionally used in physics texts.

The relation between real spin representations and division algebras is originally due to Kugo-Townsend 82, Sudbery 84 and others. We follow the streamlined discussion in Baez-Huerta 09 and Baez-Huerta 10.

A key extra structure that the spinors impose on the underlying Cartesian space of spacetime is its causal structure, which determines which points in spacetime (“events”) are in the future or the past of other points (def. 29 below). This causal structure will turn out to tightly control the quantum field theory on spacetime in terms of the “causal additivity of the S-matrix” (prop. 91 below) and the induced “causal locality” of the algebra of quantum observables (prop. 94 below). To prepare the discussion of these constructions, we end this chapter with some basics on the causal structure of Minkowski spacetime.

\,

  1. Real division algebras

  2. Spacetime in dimensions 3, 4, 6 and 10

  3. Lorentz group and Spin group

  4. Spinors in dimensions 3, 4, 6 and 10

  5. Causal structure

\,

Real division algebras

To amplify the following pattern and to fix our notation for algebra generators, recall these definitions:

Definition

(complex numbers)

The complex numbers $\mathbb{C}$ is the commutative algebra over the real numbers $\mathbb{R}$ which is generated from one generator $\{e_1\}$ subject to the relation

  • $(e_1)^2 = -1$.
Definition

(quaternions)

The quaternions $\mathbb{H}$ is the associative algebra over the real numbers which is generated from three generators $\{e_1, e_2, e_3\}$ subject to the relations

  1. for all $i$

    $(e_i)^2 = -1$

  2. for $(i,j,k)$ a cyclic permutation of $(1,2,3)$ then

    1. $e_i e_j = e_k$

    2. $e_j e_i = -e_k$

(figure: quaternion multiplication table; graphics grabbed from Baez 02)

Definition

(octonions)

The octonions $\mathbb{O}$ is the nonassociative algebra over the real numbers which is generated from seven generators $\{e_1, \cdots, e_7\}$ subject to the relations

  1. for all $i$

    $(e_i)^2 = -1$

  2. for $e_i \to e_j \to e_k$ an edge or circle in the diagram shown (a labeled version of the Fano plane) then

    1. $e_i e_j = e_k$

    2. $e_j e_i = -e_k$

    and all relations obtained by cyclic permutation of the indices in these equations.

(figure: octonion multiplication table, a labeled version of the Fano plane; graphics grabbed from Baez 02)

One defines the following operations on these real algebras:

Definition

(conjugation, real part, imaginary part and absolute value)

For $\mathbb{K} \in \{\mathbb{R}, \mathbb{C}, \mathbb{H}, \mathbb{O}\}$, let

$$ (-)^\ast \;\colon\; \mathbb{K} \longrightarrow \mathbb{K} $$

be the antihomomorphism of real algebras

$$ \begin{aligned} (r a)^\ast = r a^\ast &, \text{for}\;\; r \in \mathbb{R}, a \in \mathbb{K} \\ (a b)^\ast = b^\ast a^\ast &,\text{for}\;\; a,b \in \mathbb{K} \end{aligned} $$

given on the generators of def. 16, def. 17 and def. 18 by

$$ (e_i)^\ast = - e_i \,. $$

This operation makes $\mathbb{K}$ into a star algebra. For the complex numbers $\mathbb{C}$ this is called complex conjugation, and in general we call it conjugation.

Let then

$$ Re \;\colon\; \mathbb{K} \longrightarrow \mathbb{R} $$

be the function

$$ Re(a) \;\coloneqq\; \tfrac{1}{2}(a + a^\ast) $$

(“real part”) and

$$ Im \;\colon\; \mathbb{K} \longrightarrow \mathbb{K} $$

be the function

$$ Im(a) \;\coloneqq \; \tfrac{1}{2}(a - a^\ast) $$

(“imaginary part”), which takes values in the span of the generators $e_i$.

It follows that for all $a \in \mathbb{K}$ the product of $a$ with its conjugate is in the real center of $\mathbb{K}$

$$ a a^\ast = a^\ast a \;\in \mathbb{R} \hookrightarrow \mathbb{K} $$

and we write the square root of this expression as

$$ {\vert a\vert} \;\coloneqq\; \sqrt{a a^\ast} $$

called the norm or absolute value function

$$ {\vert -\vert} \;\colon\; \mathbb{K} \longrightarrow \mathbb{R} \,. $$

This norm operation clearly satisfies the following properties (for all $a,b \in \mathbb{K}$)

  1. $\vert a \vert \geq 0$;

  2. ${\vert a \vert } = 0 \;\; \Leftrightarrow\;\; a = 0$;

  3. ${\vert a b \vert } = {\vert a \vert} {\vert b \vert}$

and hence makes $\mathbb{K}$ a normed algebra.

Since $\mathbb{R}$ is a division algebra, these relations immediately imply that each $\mathbb{K}$ is a division algebra, in that

$$ a b = 0 \;\;\;\;\;\; \Rightarrow \;\;\;\;\;\; a = 0 \;\; \text{or} \;\; b = 0 \,. $$

Hence the conjugation operation makes $\mathbb{K}$ a real normed division algebra.

Remark

(sequence of inclusions of real normed division algebras)

Suitably embedding the sets of generators in def. 16, def. 17 and def. 18 into each other yields sequences of real star-algebra inclusions

$$ \mathbb{R} \hookrightarrow \mathbb{C} \hookrightarrow \mathbb{H} \hookrightarrow \mathbb{O} \,. $$

For example for the first two inclusions we may send each generator to the generator of the same name, and for the last inclusion we may choose

$$ \array{ 1 &\mapsto& 1 \\ e_1 &\mapsto & e_3 \\ e_2 &\mapsto& e_4 \\ e_3 &\mapsto& e_6 } $$
Proposition

(Hurwitz theorem: $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ and $\mathbb{O}$ are the normed real division algebras)

The four algebras of real numbers $\mathbb{R}$, complex numbers $\mathbb{C}$, quaternions $\mathbb{H}$ and octonions $\mathbb{O}$ from def. 16, def. 17 and def. 18 respectively, which are real normed division algebras via def. 19, are, up to isomorphism, the only real normed division algebras that exist.

Remark

(Cayley-Dickson construction and sedenions)

While prop. 5 says that the sequence from remark 1

$$ \mathbb{R} \hookrightarrow \mathbb{C} \hookrightarrow \mathbb{H} \hookrightarrow \mathbb{O} $$

is maximal in the category of real normed non-associative division algebras, there is a pattern that does continue if one disregards the division algebra property. Namely each step in this sequence is given by a construction called forming the Cayley-Dickson double algebra. This continues to an unbounded sequence of real nonassociative star-algebras

$$ \mathbb{R} \hookrightarrow \mathbb{C} \hookrightarrow \mathbb{H} \hookrightarrow \mathbb{O} \hookrightarrow \mathbb{S} \hookrightarrow \cdots $$

where the next algebra $\mathbb{S}$ is called the sedenions.

What actually matters for the following relation of the real normed division algebras to real spin representations is that they are also alternative algebras:

Definition

(alternative algebras)

Given any non-associative algebra $A$, then the trilinear map

$$ [-,-,-] \;\colon\; A \otimes A \otimes A \longrightarrow A $$

given on any elements $a,b,c \in A$ by

$$ [a,b,c] \coloneqq (a b) c - a (b c) $$

is called the associator (in analogy with the commutator $[a,b] \coloneqq a b - b a$).

If the associator is completely antisymmetric (in that for any permutation $\sigma$ of three elements then $[a_{\sigma_1}, a_{\sigma_2}, a_{\sigma_3}] = (-1)^{\vert \sigma\vert} [a_1, a_2, a_3]$ for $\vert \sigma \vert$ the signature of the permutation) then $A$ is called an alternative algebra.

If the characteristic of the ground field is different from 2, then alternativity is readily seen to be equivalent to the conditions that for all $a,b \in A$ then

$$ (a a)b = a (a b) \;\;\;\;\; \text{and} \;\;\;\;\; (a b) b = a (b b) \,. $$

We record some basic properties of associators in alternative star-algebras that we need below:

Proposition

(properties of alternative star algebras)

Let $A$ be an alternative algebra (def. 20) which is also a star algebra. Then (using def. 19):

  1. the associator vanishes when at least one argument is real

    $$ [Re(a),b,c] = 0 \,; $$
  2. the associator changes sign when one of its arguments is conjugated

    $$ [a,b,c] = -[a^\ast,b,c] \,; $$
  3. the associator vanishes when one of its arguments is the conjugate of another

    $$ [a,a^\ast, b] = 0 \,; $$
  4. the associator is purely imaginary

    $$ Re([a,b,c]) = 0 \,. $$
Proof

That the associator vanishes as soon as one argument is real is just the linearity of an algebra product over the ground ring.

Hence in fact

$$ [a,b,c] = [Im(a), Im(b), Im(c)] \,. $$

This implies the second statement by linearity, since conjugation reverses the sign of the imaginary part. And so follows the third statement by skew-symmetry:

$$ [a,a^\ast,b] = -[a,a,b] = 0 \,. $$

The fourth statement finally follows by this computation:

$$ \begin{aligned} \,[ a, b, c]^\ast & = -[c^\ast, b^\ast, a^\ast] \\ & = [c,b,a] \\ & = -[a,b,c] \end{aligned} \,. $$

Here the first equation follows by inspection and using that $(a b)^\ast = b^\ast a^\ast$, the second follows from the second statement above applied to each of the three arguments, and the third is the anti-symmetry of the associator.

It is immediate to check that:

Proposition

($\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ and $\mathbb{O}$ are real alternative algebras)

The real algebras of real numbers, complex numbers (def. 16), quaternions (def. 17) and octonions (def. 18) are alternative algebras (def. 20).

Proof

Since the real numbers, complex numbers and quaternions are associative algebras, their associator vanishes identically. It only remains to see that the associator of the octonions is skew-symmetric. By linearity it is sufficient to check this on generators. So let $e_i \to e_j \to e_k$ be a circle or a cyclic permutation of an edge in the Fano plane. Then by definition of the octonion multiplication we have

$$ \begin{aligned} (e_i e_j) e_j &= e_k e_j \\ &= - e_j e_k \\ & = -e_i \\ & = e_i (e_j e_j) \end{aligned} $$

and similarly

$$ \begin{aligned} (e_i e_i ) e_j &= - e_j \\ &= - e_k e_i \\ &= e_i e_k \\ &= e_i (e_i e_j) \end{aligned} \,. $$
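The statements above can be explored numerically with the Cayley-Dickson doubling mentioned in the remark on sedenions. The following is a minimal sketch under an assumed convention: elements are lists of $2^n$ integers, and the product is $(a,b)(c,d) = (a c - d^\ast b,\; d a + b c^\ast)$ with $(a,b)^\ast = (a^\ast, -b)$. This is one of several equivalent conventions, and the resulting basis labeling need not match the Fano-plane labeling used in def. 18. All function names are hypothetical.

```python
def conj(x):
    """Conjugation (a, b)^* = (a^*, -b); the identity on real numbers."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-t for t in x[h:]]

def add(x, y):
    return [s + t for s, t in zip(x, y)]

def sub(x, y):
    return [s - t for s, t in zip(x, y)]

def mul(x, y):
    """Cayley-Dickson product (a, b)(c, d) = (a c - d^* b, d a + b c^*)."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))

def norm_sq(x):
    return sum(t * t for t in x)

def associator(x, y, z):
    return sub(mul(mul(x, y), z), mul(x, mul(y, z)))

def basis(dim, i):
    """The i-th basis element; index 0 is the unit 1."""
    return [1 if j == i else 0 for j in range(dim)]

# Quaternions (dim 4): e_1 e_2 = e_3 and e_2 e_1 = -e_3
q1, q2, q3 = basis(4, 1), basis(4, 2), basis(4, 3)
quat_ok = mul(q1, q2) == q3 and mul(q2, q1) == [-t for t in q3]

# Octonions (dim 8): not associative, but alternative and normed
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, -1, 0, 3, 1, -2, 4, 1]
nonassociative = associator(basis(8, 1), basis(8, 2), basis(8, 4)) != [0] * 8
alternative = associator(x, x, y) == [0] * 8 and associator(x, y, y) == [0] * 8
normed = norm_sq(mul(x, y)) == norm_sq(x) * norm_sq(y)
```

Because all entries are integers, the checks are exact rather than approximate. Running the same `mul` on lists of length 16 produces the sedenions, for which the norm identity is no longer guaranteed.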

The analog of the Hurwitz theorem (prop. 5) is now this:

Proposition

($\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ and $\mathbb{O}$ are precisely the alternative real division algebras)

The only division algebras over the real numbers which are also alternative algebras (def. 20) are the real numbers themselves, the complex numbers, the quaternions and the octonions from prop. 7.

This is due to (Zorn 30).

For the following, the key point of alternative algebras is this equivalent characterization:

Proposition

(alternative algebra detected on subalgebras spanned by any two elements)

A nonassociative algebra is alternative, def. 20, precisely if the subalgebra generated by any two elements is an associative algebra.

This is due to Emil Artin, see for instance (Schafer 95, p. 18).

Proposition 9 is what allows us to carry over a minimum of linear algebra to the octonions, so as to yield a representation of the Clifford algebra on $\mathbb{R}^{9,1}$. This happens in the proof of prop. 15 below.

So we will be looking at a fragment of linear algebra over these four normed division algebras. To that end, fix the following notation and terminology:

Definition

(hermitian matrices with values in real normed division algebras)

Let $\mathbb{K}$ be one of the four real normed division algebras from prop. 5, hence equivalently one of the four real alternative division algebras from prop. 8.

Say that an $n \times n$ matrix with coefficients in $\mathbb{K}$

$$ A\in Mat_{n\times n}(\mathbb{K}) $$

is a hermitian matrix if the transpose matrix $(A^t)_{i j} \coloneqq A_{j i}$ equals the componentwise conjugated matrix (def. 19):

$$ A^t = A^\ast \,. $$

Hence with the notation

$$ (-)^\dagger \coloneqq ((-)^t)^\ast $$

we have that $A$ is a hermitian matrix precisely if

$$ A = A^\dagger \,. $$

We write $Mat_{2 \times 2}^{her}(\mathbb{K})$ for the real vector space of hermitian $2 \times 2$ matrices.

Definition

(trace reversal)

Let $A \in Mat_{2 \times 2}^{her}(\mathbb{K})$ be a hermitian $2 \times 2$ matrix as in def. 21. Its trace reversal is the result of subtracting its trace times the identity matrix:

$$ \tilde A \;\coloneqq\; A - (tr A) 1_{2\times 2} \,. $$

\,

Minkowski spacetime in dimensions 3,4,6 and 10

We now discover Minkowski spacetime of dimension 3, 4, 6 and 10, in terms of the real normed division algebras $\mathbb{K}$ from prop. 5, equivalently the real alternative division algebras from prop. 8: this is prop./def. 10 and def. 23 below.

Proposition/Definition

(Minkowski spacetime as real vector space of hermitian matrices in real normed division algebras)

Let $\mathbb{K}$ be one of the four real normed division algebras from prop. 5, hence one of the four real alternative division algebras from prop. 8.

Then the real vector space of $2 \times 2$ hermitian matrices over $\mathbb{K}$ (def. 21) equipped with the inner product $\eta$ whose quadratic form ${\vert -\vert^2_\eta}$ is the negative of the determinant operation on matrices is Minkowski spacetime:

$$ \begin{aligned} \mathbb{R}^{dim_{\mathbb{R}}(\mathbb{K})+1,1} & \coloneqq \left( \mathbb{R}^{dim_{\mathbb{R}}(\mathbb{K})+2} , {\vert -\vert^2_\eta} \right) \\ & \coloneqq \left(Mat_{2 \times 2}^{her}(\mathbb{K}), -det \right) \end{aligned} \,. \tag{8} $$

hence

  1. $\mathbb{R}^{2,1}$ for $\mathbb{K} = \mathbb{R}$;

  2. $\mathbb{R}^{3,1}$ for $\mathbb{K} = \mathbb{C}$;

  3. $\mathbb{R}^{5,1}$ for $\mathbb{K} = \mathbb{H}$;

  4. $\mathbb{R}^{9,1}$ for $\mathbb{K} = \mathbb{O}$.

Here we think of the vector space on the left as $\mathbb{R}^{p,1}$ with

$$ p \coloneqq dim_{\mathbb{R}}(\mathbb{K})+1 $$

equipped with the canonical coordinates labeled $(x^\mu)_{\mu = 0}^p$.

As a linear map the identification is given by

$$ (x^0, x^1, \cdots, x^{p}) \;\mapsto\; \left( \array{ x^0 + x^1 & y \\ y^\ast & x^0 - x^1 } \right) \;\;\; \text{with}\; y \coloneqq x^2 1 + x^3 e_1 + x^4 e_2 + \cdots + x^{1 + dim_{\mathbb{R}}(\mathbb{K})} \,e_{dim_{\mathbb{R}}(\mathbb{K})-1} \,. $$

This means that the quadratic form ${\vert - \vert^2_\eta}$ is given on an element $v = (v^\mu)_{\mu = 0}^p$ by

$$ {\vert v \vert}^2_{\eta} \;=\; - (v^0)^2 + \underoverset{j = 1}{p}{\sum} (v^j)^2 \,. $$

By the polarization identity the quadratic form ${\vert - \vert^2_\eta}$ induces a bilinear form

$$ \eta \;\colon\; \mathbb{R}^{p,1}\otimes \mathbb{R}^{p,1} \longrightarrow \mathbb{R} $$

given by

$$ \begin{aligned} \eta(v_1, v_2) & = \eta_{\mu \nu} v_1^\mu v_2^\nu \\ & \coloneqq - v_1^0 v_2^0 + \underoverset{j = 1}{p}{\sum} v_1^j v_2^j \end{aligned} \,. $$

This is called the Minkowski metric.

Finally, under the above identification the operation of trace reversal from def. 22 corresponds to time reversal in that

$$ \widetilde{ \left( \array{ x^0 + x^1 & y \\ y^\ast & x^0 - x^1 } \right) } \;=\; \left( \array{ -x^0 + x^1 & y \\ y^\ast & -x^0 - x^1 } \right) \,. $$
Proof

We need to check that under the given identification, the Minkowski norm-square is indeed given by minus the determinant on the corresponding hermitian matrices. This follows from the nature of the conjugation operation $(-)^\ast$ from def. 19, by which $y y^\ast = \underoverset{i = 2}{p}{\sum} (x^i)^2$:

$$ \begin{aligned} - det \left( \array{ x^0 + x^1 & y \\ y^\ast & x^0 - x^1 } \right) & = -(x^0 + x^1)(x^0 - x^1) + y y^\ast \\ & = -(x^0)^2 + \underoverset{i = 1}{p}{\sum} (x^i)^2 \end{aligned} \,. $$
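For the case $\mathbb{K} = \mathbb{C}$, hence $\mathbb{R}^{3,1}$, the identification and the proof above can be checked directly with Python's built-in complex numbers. This is only a sketch for that one case; the helper names `to_hermitian`, `det2`, `minkowski_norm_sq` and `trace_reversal` are hypothetical and not taken from the text.

```python
def to_hermitian(x):
    """Send (x0, x1, x2, x3) to the hermitian matrix [[x0+x1, y], [y^*, x0-x1]]
    with y = x2 + x3 e_1, where e_1 is the imaginary unit."""
    x0, x1, x2, x3 = x
    y = complex(x2, x3)
    return [[x0 + x1, y], [y.conjugate(), x0 - x1]]

def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def minkowski_norm_sq(x):
    """-(x0)^2 + sum of the squares of the spatial components."""
    return -x[0] ** 2 + sum(c * c for c in x[1:])

def trace_reversal(A):
    """A - (tr A) 1 for a 2x2 matrix."""
    tr = A[0][0] + A[1][1]
    return [[A[0][0] - tr, A[0][1]], [A[1][0], A[1][1] - tr]]

x = (2, 1, 3, -1)
A = to_hermitian(x)

minus_det = -det2(A)                 # a complex number with zero imaginary part
norm_sq = minkowski_norm_sq(x)       # -4 + 1 + 9 + 1

# Trace reversal should flip the sign of the time coordinate x0 only
time_reversed = to_hermitian((-x[0],) + x[1:])
```

For the sample vector both `minus_det` and `norm_sq` evaluate to 7, and `trace_reversal(A)` agrees entry by entry with `time_reversed`.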
Remark

(physical units of length)

As the term “metric” suggests, in application to physics, the Minkowski metric $\eta$ in prop./def. 10 is regarded as a measure of length: for $v \in \Gamma_x(T \mathbb{R}^{p,1})$ a tangent vector at a point $x$ in Minkowski spacetime, interpreted as a displacement from event $x$ to event $x + v$, then

  1. if $\eta(v,v) \gt 0$ then

    $$ \sqrt{\eta(v,v)} \in \mathbb{R} $$

    is interpreted as a measure for the spatial distance between $x$ and $x + v$;

  2. if $\eta(v,v) \lt 0$ then

    $$ \sqrt{-\eta(v,v)} \in \mathbb{R} $$

    is interpreted as a measure for the time distance between $x$ and $x + v$.

But for this to make physical sense, an operational prescription needs to be specified that tells the experimenter how the real number $\sqrt{\eta(v,v)}$ is to be translated into a physical distance between actual events in the observable universe.

Such an operational prescription is called a physical unit of length. For example “centimeter” $cm$ is a physical unit of length, another one is “femtometer” $fm$.

The combined information of a real number $\sqrt{\eta(v,v)} \in \mathbb{R}$ and a physical unit of length such as centimeter, jointly written

$$ \sqrt{\eta(v,v)} \, cm $$

is a prescription for finding actual distance in the observable universe. Alternatively

$$ \sqrt{\eta(v,v)} \, fm $$

is another prescription, that translates the same real number $\sqrt{\eta(v,v)}$ into another physical distance.

But of course they are related, since physical units form a torsor over the group $\mathbb{R}_{\gt 0}$ of positive real numbers, meaning that any two are related by a unique rescaling. For example

$$ fm = 10^{-13} cm \,, $$

with $10^{-13} \in \mathbb{R}_{\gt 0}$.

This means that once any one prescription of turning real numbers into spacetime distances is specified, then any other such prescription is obtained from this by rescaling these real numbers. For example

$$ \begin{aligned} \sqrt{\eta(v,v)} \, fm & = \left( 10^{-13} \sqrt{\eta(v,v)}\right) \,cm \\ & = \sqrt{ 10^{-26} \eta(v,v) } \, cm \end{aligned} \,. $$

The point to notice here is that, via the last line, we may think of this as rescaling the metric from $\eta$ to $10^{-26} \eta$.
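The rescaling can be checked numerically. The following is a minimal sketch, assuming the mostly-plus signature used above and spacetime dimension $3+1$; the function name `eta` and the sample vector are chosen only for illustration.

```python
import math

# Mostly-plus Minkowski metric on R^{3,1}: eta(v, w) = -v^0 w^0 + sum_i v^i w^i
def eta(v, w):
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

FM_IN_CM = 1e-13  # 1 fm = 10^{-13} cm

v = (0.0, 3.0, 4.0, 0.0)            # a spacelike displacement with eta(v, v) = 25
length_fm = math.sqrt(eta(v, v))    # the real number 5.0, read as "5 fm"
length_cm = FM_IN_CM * length_fm    # the same physical distance, expressed in cm

# Equivalently: keep the unit cm fixed and rescale the metric by (10^{-13})^2 = 10^{-26}
length_cm_rescaled = math.sqrt(FM_IN_CM**2 * eta(v, v))
```

Both ways of computing the length in centimeters agree, which is the statement that a change of unit amounts to a rescaling of the metric.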

In quantum field theory physical units of length are typically expressed in terms of a physical unit of “action”, called “Planck's constant” $\hbar$, via the combination of units called the Compton wavelength

(9) $$ \ell_m = \frac{2\pi \hbar}{m c} \,, $$

parameterized, in turn, by a physical unit of mass $m$. For the mass of the electron, the Compton wavelength is

$$ \ell_e = \frac{2\pi \hbar}{m_e c} \approx 2426 \, fm $$

(the reduced Compton wavelength $\hbar/(m_e c)$ is about $386 \, fm$).

Another physical unit of length parameterized by a mass $m$ is the Schwarzschild radius $r_m \coloneqq 2 m G/c^2$, where $G$ is the gravitational constant. Solving the equation

$$ \array{ & \ell_m &=& r_m \\ \Leftrightarrow & 2\pi\hbar / m c &=& 2 m G / c^2 } $$

for $m$ yields, up to a factor of $\sqrt{\pi}$, the Planck mass

$$ m_{P} \coloneqq \tfrac{1}{\sqrt{\pi}} m_{\ell = r} = \sqrt{\frac{\hbar c}{G}} \,. $$

The corresponding Compton wavelength $\ell_{m_{P}}$ is given, up to a factor of $2\pi$, by the Planck length $\ell_P$

$$ \ell_{P} \coloneqq \tfrac{1}{2\pi} \ell_{m_P} = \sqrt{ \frac{\hbar G}{c^3} } \,. $$
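To make the orders of magnitude concrete, here is a small numerical sketch of the Compton wavelength (9), the Schwarzschild radius and the resulting Planck units. The constants are CODATA 2018 values in SI units; the function and variable names are assumptions made for this illustration.

```python
import math

# CODATA 2018 values in SI units
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299792458.0          # speed of light, m / s
G = 6.67430e-11          # gravitational constant, m^3 / (kg s^2)
M_E = 9.1093837015e-31   # electron mass, kg

def compton_wavelength(m):
    """ell_m = 2 pi hbar / (m c), in metres."""
    return 2 * math.pi * HBAR / (m * C)

def schwarzschild_radius(m):
    """r_m = 2 m G / c^2, in metres."""
    return 2 * m * G / C**2

ell_e_fm = compton_wavelength(M_E) * 1e15        # metres -> femtometres
reduced_ell_e_fm = ell_e_fm / (2 * math.pi)      # hbar / (m_e c), in femtometres

m_equal = math.sqrt(math.pi * HBAR * C / G)      # the mass at which ell_m = r_m
m_planck = m_equal / math.sqrt(math.pi)          # Planck mass, kg
ell_planck = compton_wavelength(m_planck) / (2 * math.pi)   # Planck length, m
```

With these inputs the Planck mass comes out near $2.18 \cdot 10^{-8}$ kg and the Planck length near $1.62 \cdot 10^{-35}$ m.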
Definition

(Minkowski spacetime as a pseudo-Riemannian Cartesian space)

Prop./def. 10 introduces Minkowski spacetime $\mathbb{R}^{p,1}$ for $p+1 \in \{3,4,6,10\}$ as a vector space $\mathbb{R}^{p,1}$ equipped with a norm ${\vert - \vert_\eta}$. The genuine spacetime corresponding to this is this vector space regarded as a Cartesian space, i.e. with smooth functions (instead of just linear maps) to it and from it (def. 1). This still carries one copy of $\mathbb{R}^{p,1}$ over each point $x \in \mathbb{R}^{p,1}$, as its tangent space (example 5)

$$ T_x \mathbb{R}^{p,1} \simeq \mathbb{R}^{p,1} $$

and the Cartesian space $\mathbb{R}^{p,1}$ equipped with the Lorentzian inner product from prop./def. 10 on each tangent space $T_x \mathbb{R}^{p,1}$ (a “pseudo-Riemannian Cartesian space”) is Minkowski spacetime as such.

We write

(10) $$ dvol_\Sigma \;\coloneqq\; d x^0 \wedge d x^1 \wedge \cdots \wedge d x^p \in \Omega^{p+1}(\mathbb{R}^{p,1}) $$

for the canonical volume form on Minkowski spacetime.

We use the Einstein summation convention: Expressions with repeated indices indicate summation over the range of indices.

For example a differential 1-form $\alpha \in \Omega^1(\mathbb{R}^{p,1})$ on Minkowski spacetime may be expanded as

$$ \alpha = \alpha_\mu d x^\mu \,. $$

Moreover we use square brackets around indices to indicate skew-symmetrization. For example a differential 2-form $\beta \in \Omega^2(\mathbb{R}^{p,1})$ on Minkowski spacetime may be expanded as

$$ \begin{aligned} \beta & = \beta_{\mu \nu} d x^\mu \wedge d x^\nu \\ & = \beta_{[\mu \nu]} d x^\mu \wedge d x^\nu \end{aligned} $$

\,

The identification of Minkowski spacetime (def. 23) in the exceptional dimensions with the generalized Pauli matrices (prop./def. 10) has some immediate useful implications:

Proposition

(Minkowski metric in terms of trace reversal)

In terms of the trace reversal operation $\widetilde{(-)}$ from def. 22, the determinant operation on hermitian matrices (def. 21) has the following alternative expression

$$ \begin{aligned} -det(A) & = A \tilde A \\ & = \tilde A A \end{aligned} \,, $$

and the Minkowski inner product from prop./def. 10 has the alternative expression

$$ \begin{aligned} \eta(A,B) & = \tfrac{1}{2}Re(tr(A \tilde B)) \\ & = \tfrac{1}{2} Re(tr(\tilde A B)) \end{aligned} \,. $$

(Baez-Huerta 09, prop. 5)
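Both identities are easy to test numerically in the case $\mathbb{K} = \mathbb{C}$. The sketch below assumes the parametrization $A = \left(\array{ x_0 + x_1 & y \\ y^\ast & x_0 - x_1 }\right)$ of hermitian matrices that is used in the proofs further below; the helper names are made up for the example.

```python
def herm(x0, x1, y):
    """Hermitian matrix [[x0 + x1, y], [conj(y), x0 - x1]] with real x0, x1 and complex y."""
    y = complex(y)
    return [[complex(x0 + x1), y], [y.conjugate(), complex(x0 - x1)]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def trace_reversal(X):
    """X - tr(X) * identity."""
    t = trace(X)
    return [[X[0][0] - t, X[0][1]], [X[1][0], X[1][1] - t]]

def eta(A, B):
    """Minkowski inner product in the form (1/2) Re tr(A B~)."""
    return 0.5 * trace(matmul(A, trace_reversal(B))).real

A = herm(2.0, 0.5, 1 + 3j)
B = herm(-1.0, 4.0, 2 - 1j)

# A A~ should be -det(A) times the identity matrix
product = matmul(A, trace_reversal(A))
```

For these sample matrices $det(A) = -6.25$, so `product` is $6.25$ times the identity, and `eta(A, B)` agrees with the component formula $-x_0 x'_0 + x_1 x'_1 + Re(y \, y'^\ast)$.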

Proposition

(special linear group $SL(2,\mathbb{K})$ acts by linear isometries on Minkowski spacetime)

For $\mathbb{K} \in \{\mathbb{R}, \mathbb{C}, \mathbb{H}, \mathbb{O}\}$ one of the four real normed division algebras (prop. 5) the special linear group $SL(2,\mathbb{K})$ acts on Minkowski spacetime $\mathbb{R}^{p,1}$ in dimension $p+1 \in \{2+1, \,3+1, \, 5+1, \, 9+1\}$ (def. 23) by linear isometries given under the identification with the Pauli matrices in prop./def. 10 by conjugation:

$$ \array{ SL(2,\mathbb{K}) \times \mathbb{R}^{dim(\mathbb{K})+1,1} & \simeq & SL(2, \mathbb{K}) \times Mat^{herm}_{2 \times 2}(\mathbb{K}) &\overset{}{\longrightarrow}& Mat^{herm}_{2 \times 2}(\mathbb{K}) & \simeq & \mathbb{R}^{dim(\mathbb{K})+1,1} \\ && (G, A) &\mapsto& G \, A \, G^\dagger } $$
Proof

For $\mathbb{K} \in \{\mathbb{R}, \mathbb{C}, \mathbb{H}\}$ this is immediate from matrix calculus, but we spell it out now. While the argument does not directly apply to the case $\mathbb{K} = \mathbb{O}$ of the octonions, one can check that it still goes through, too.

First we need to see that the action is well defined. This follows from the associativity of matrix multiplication and the fact that forming conjugate transpose matrices is an antihomomorphism: $(G_1 G_2)^\dagger = G_2^\dagger G_1^\dagger$. In particular this implies that the action indeed sends hermitian matrices to hermitian matrices:

$$ \begin{aligned} \left( G \, A \, G^\dagger \right)^\dagger & = \underset{= G}{\underbrace{\left( G^\dagger \right)^\dagger}} \, \underset{= A}{\underbrace{A^\dagger}} \, G^\dagger \\ & = G \, A \, G^\dagger \end{aligned} \,. $$

By prop./def. 10 such an action is an isometry precisely if it preserves the determinant. This follows from the multiplicative property of determinants, $det(A B) = det(A) det(B)$, from their compatibility with conjugate transposition, $det(A^\dagger) = det(A)^\ast$, and finally from the assumption that $G \in SL(2,\mathbb{K})$ is an element of the special linear group, hence that its determinant is $1 \in \mathbb{K}$:

$$ \begin{aligned} det\left( G \, A \, G^\dagger \right) & = \underset{ = 1}{\underbrace{det(G)}} \, det(A) \, \underset{= 1^\ast = 1}{\underbrace{det(G^\dagger)}} \\ & = det(A) \end{aligned} \,. $$
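The two steps of the proof can be illustrated numerically for $\mathbb{K} = \mathbb{C}$. This is only a sketch with one sample element of $SL(2,\mathbb{C})$; the helper names and the chosen matrix entries are assumptions made for the example.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(X):
    """Conjugate transpose."""
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def act(G, A):
    """The action (G, A) |-> G A G^dagger."""
    return matmul(matmul(G, A), dagger(G))

# A sample element of SL(2, C): pick a, b, c and solve a d - b c = 1 for d
a, b, c = 1 + 2j, 0.5j, 3 + 0j
G = [[a, b], [c, (1 + b * c) / a]]

# A sample hermitian matrix, i.e. a vector in R^{3,1}
A = [[2.5 + 0j, 1 + 3j], [1 - 3j, 1.5 + 0j]]

B = act(G, A)   # should again be hermitian, with det(B) = det(A)
```

Since $det(A) = -\eta(A,A)$, equality of the determinants of `A` and `B` is the isometry property for this sample.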

In fact the special linear groups of linear isometries in prop. 12 are the spin groups (def. 26 below) in these dimensions.

exceptional spinors and real normed division algebras

| Lorentzian spacetime dimension | spin group | normed division algebra | brane scan entry |
|---|---|---|---|
| $3 = 2+1$ | $Spin(2,1) \simeq SL(2,\mathbb{R})$ | $\mathbb{R}$ the real numbers | super 1-brane in 3d |
| $4 = 3+1$ | $Spin(3,1) \simeq SL(2, \mathbb{C})$ | $\mathbb{C}$ the complex numbers | super 2-brane in 4d |
| $6 = 5+1$ | $Spin(5,1) \simeq SL(2, \mathbb{H})$ | $\mathbb{H}$ the quaternions | little string |
| $10 = 9+1$ | $Spin(9,1) {\simeq} \text{"}SL(2,\mathbb{O})\text{"}$ | $\mathbb{O}$ the octonions | heterotic/type II string |

This we explain now.

\,

Lorentz group and spin group

Definition

(Lorentz group)

For $d \in \mathbb{N}$, write

$$ O(d-1,1) \hookrightarrow GL(\mathbb{R}^d) $$

for the subgroup of the general linear group on those linear maps $A$ which preserve the bilinear form $\eta$ on Minkowski spacetime (def. 23), in that

$$ \eta(A(-),A(-)) = \eta(-,-) \,. $$

This is the Lorentz group in dimension $d$.

The elements in the Lorentz group in the image of the special orthogonal group $SO(d-1) \hookrightarrow O(d-1,1)$ are rotations in space. The further elements in the special Lorentz group $SO(d-1,1)$, which mathematically are “hyperbolic rotations” in a space-time plane, are called boosts in physics.

One distinguishes the following further subgroups of the Lorentz group $O(d-1,1)$:

  • the proper Lorentz group

    $$ SO(d-1,1) \hookrightarrow O(d-1,1) $$

    is the subgroup of elements which have determinant +1 (as elements $SO(d-1,1)\hookrightarrow GL(d)$ of the general linear group);

  • the proper orthochronous (or restricted) Lorentz group

    $$ SO^+(d-1,1) \hookrightarrow SO(d-1,1) $$

    is the further subgroup of elements $A$ which preserve the time orientation of vectors $v$ in that $(v^0 \gt 0) \Rightarrow ((A v)^0 \gt 0)$.

Proposition

(connected component of Lorentz group)

As a smooth manifold, the Lorentz group $O(d-1,1)$ (def. 24) has four connected components. The connected component of the identity is the proper orthochronous Lorentz group $SO^+(d-1,1)$ (def. 24). The other three components are

  1. $SO^+(d-1,1)\cdot P$

  2. $SO^+(d-1,1)\cdot T$

  3. $SO^+(d-1,1)\cdot P T$,

where, as matrices,

$$ P \coloneqq diag(1,-1,-1, \cdots, -1) $$

is the operation of point reflection at the origin in space, where

$$ T \coloneqq diag(-1,1,1, \cdots, 1) $$

is the operation of reflection in time and hence where

$$ P T = T P = diag(-1,-1, \cdots, -1) $$

is point reflection in spacetime.
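As an illustration, the component of a given Lorentz matrix can be read off from the sign of its determinant and the sign of its time-time entry. The sketch below fixes $d = 4$ and the mostly-plus metric $\eta = diag(-1,1,1,1)$; this restriction matters, since $P$ has determinant $-1$ only when the number $d-1$ of spatial dimensions is odd. All function names are hypothetical.

```python
import math

D = 4
ETA = [[(-1.0 if i == 0 else 1.0) if i == j else 0.0 for j in range(D)] for i in range(D)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(D)) for j in range(D)] for i in range(D)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def det(X):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(X) == 1:
        return X[0][0]
    return sum((-1) ** j * X[0][j] * det([row[:j] + row[j + 1:] for row in X[1:]])
               for j in range(len(X)))

def diag(entries):
    return [[entries[i] if i == j else 0.0 for j in range(D)] for i in range(D)]

def is_lorentz(L, tol=1e-9):
    """Check the defining condition L^T eta L = eta."""
    M = matmul(matmul(transpose(L), ETA), L)
    return all(abs(M[i][j] - ETA[i][j]) < tol for i in range(D) for j in range(D))

def component(L):
    """Name the connected component of O(3,1) containing the Lorentz matrix L."""
    proper = det(L) > 0
    orthochronous = L[0][0] > 0
    names = {(True, True): "SO+", (False, True): "SO+ . P",
             (False, False): "SO+ . T", (True, False): "SO+ . PT"}
    return names[(proper, orthochronous)]

def boost(rapidity):
    """Boost in the (x^0, x^1)-plane, a hyperbolic rotation."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0.0, 0.0], [sh, ch, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

P = diag([1.0, -1.0, -1.0, -1.0])
T = diag([-1.0, 1.0, 1.0, 1.0])
```

For example `component(boost(0.7))` is `"SO+"`, while composing the boost with `T` lands in the component `"SO+ . T"`.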

The following concept of the Clifford algebra (def. 25) of Minkowski spacetime encodes the structure of the inner product space $\mathbb{R}^{d-1,1}$ in terms of algebraic operations (“geometric algebra”), such that the action of the Lorentz group becomes represented by a conjugation action (example 7 below). In particular this means that every element of the proper orthochronous Lorentz group may be “split in half” to yield a double cover: the spin group (def. 26 below).

Definition

(Clifford algebra)

For $d \in \mathbb{N}$, we write

$$ Cl(\mathbb{R}^{d-1,1}) $$

for the $\mathbb{Z}/2$-graded associative algebra over $\mathbb{R}$ which is generated from $d$ generators $\{\Gamma_0, \Gamma_1, \Gamma_2, \cdots, \Gamma_{d-1}\}$ in odd degree (“Clifford generators”), subject to the relation

(11) $$ \Gamma_{a} \Gamma_b + \Gamma_b \Gamma_a = - 2\eta_{a b} $$

where $\eta$ is the inner product of Minkowski spacetime as in def. 23.

These relations say equivalently that

$$ \begin{aligned} & \Gamma_0^2 = +1 \\ & \Gamma_i^2 = -1 \;\; \text{for}\; i \in \{1,\cdots, d-1\} \\ & \Gamma_a \Gamma_b = - \Gamma_b \Gamma_a \;\;\; \text{for}\; a \neq b \end{aligned} \,. $$

We write

$$ \Gamma_{a_1 \cdots a_p} \;\coloneqq\; \frac{1}{p!} \underset{{permutations \atop \sigma}}{\sum} (-1)^{\vert \sigma\vert } \Gamma_{a_{\sigma(1)}} \cdots \Gamma_{a_{\sigma(p)}} $$

for the antisymmetrized product of $p$ Clifford generators. In particular, if all the $a_i$ are pairwise distinct, then this is simply the plain product of generators

$$ \Gamma_{a_1 \cdots a_p} = \Gamma_{a_1} \cdots \Gamma_{a_p} \;\;\; \text{if} \; \underset{i \neq j}{\forall} (a_i \neq a_j) \,. $$

Finally, write

$$ \overline{(-)} \;\colon\; Cl(\mathbb{R}^{d-1,1}) \longrightarrow Cl(\mathbb{R}^{d-1,1}) $$

for the algebra anti-automorphism given by

$$ \overline{\Gamma_a} \coloneqq \Gamma_a $$
$$ \overline{\Gamma_a \Gamma_b} \coloneqq \Gamma_b \Gamma_a \,. $$
Remark

(vectors inside Clifford algebra)

By construction, the vector space of linear combinations of the generators in a Clifford algebra $Cl(\mathbb{R}^{d-1,1})$ (def. 25) is canonically identified with Minkowski spacetime $\mathbb{R}^{d-1,1}$ (def. 23)

$$ \widehat{(-)} \;\colon\; \mathbb{R}^{d-1,1} \hookrightarrow Cl(\mathbb{R}^{d-1,1}) $$

via

$$ x_a \mapsto \Gamma_a \,, $$

hence via

$$ v = v^a x_a \mapsto \hat v = v^a \Gamma_a \,, $$

such that the defining quadratic form on $\mathbb{R}^{d-1,1}$ is identified with the anti-commutator in the Clifford algebra

$$ \eta(v_1,v_2) = -\tfrac{1}{2}( \hat v_1 \hat v_2 + \hat v_2 \hat v_1) \,, $$

where on the right we are, in turn, identifying $\mathbb{R}$ with the linear span of the unit in $Cl(\mathbb{R}^{d-1,1})$.

The key point of the Clifford algebra (def. 25) is that it realizes spacetime reflections, rotations and boosts via conjugation actions:

Example

(Clifford conjugation)

For $d \in \mathbb{N}$ and $\mathbb{R}^{d-1,1}$ the Minkowski spacetime of def. 23, let $v \in \mathbb{R}^{d-1,1}$ be any vector, regarded as an element $\hat v \in Cl(\mathbb{R}^{d-1,1})$ via remark 4.

Then

  1. the conjugation action $\hat v \mapsto -\Gamma_a^{-1} \hat v \Gamma_a$ of a single Clifford generator $\Gamma_a$ on $\hat v$ sends $v$ to its reflection at the hyperplane $x_a = 0$;

  2. the conjugation action

    $$ \hat v \mapsto \exp(- \tfrac{\alpha}{2} \Gamma_{a b}) \hat v \exp(\tfrac{\alpha}{2} \Gamma_{a b}) $$

    sends $v$ to the result of rotating it in the $(a,b)$-plane through an angle $\alpha$.

Proof

This is immediate by inspection:

For the first statement, observe that conjugating the Clifford generator $\Gamma_b$ with $\Gamma_a$ yields $\Gamma_b$ up to a sign, depending on whether $a = b$ or not:

$$ - \Gamma_a^{-1} \Gamma_b \Gamma_a = \left\{ \array{ -\Gamma_b & \vert \text{if}\, a = b \\ \Gamma_b & \vert \text{otherwise} } \right. \,. $$

Therefore for $\hat v = v^b \Gamma_b$ the element $-\Gamma_a^{-1} \hat v \Gamma_a$ is the result of multiplying the $a$-component of $v$ by $-1$.

For the second statement, observe that

$$ -\tfrac{1}{2}[\Gamma_{a b}, \Gamma_c] = \Gamma_a \eta_{b c} - \Gamma_b \eta_{a c} \,. $$

This is the canonical action of the Lorentzian special orthogonal Lie algebra $\mathfrak{so}(d-1,1)$. Hence, by the general identity $e^{X} \hat v \, e^{-X} = \exp([X,-])(\hat v)$ applied to $X = -\tfrac{\alpha}{2}\Gamma_{a b}$,

$$ \exp(-\tfrac{\alpha}{2} \Gamma_{ab}) \hat v \exp(\tfrac{\alpha}{2} \Gamma_{ab}) = \exp(-\tfrac{\alpha}{2}[\Gamma_{a b}, -])(\hat v) $$

is the rotation action as claimed.
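Both conjugation actions can be checked on explicit matrices. The sketch below represents $Cl(\mathbb{R}^{3,1})$ by the Dirac matrices in one standard block form (an assumption of this example, any matrices satisfying (11) would do), and uses that $\Gamma_{a b}^2 = -1$ for spatial $a \neq b$, so that the exponential reduces to $\cos$ and $\sin$ terms. Helper names are hypothetical.

```python
import math

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lincomb(a, X, b, Y):
    """Entrywise a*X + b*Y."""
    return [[a * x + b * y for x, y in zip(r, s)] for r, s in zip(X, Y)]

def neg(X):
    return [[-x for x in r] for r in X]

def block(A, B, C, D):
    """4x4 matrix from four 2x2 blocks [[A, B], [C, D]]."""
    return [A[0] + B[0], A[1] + B[1], C[0] + D[0], C[1] + D[1]]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of eta, mostly-plus signature
ONE = block(I2, Z2, Z2, I2)
# Gamma_0^2 = +1, Gamma_i^2 = -1, distinct generators anticommute
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in PAULI]

def hat(v):
    """The Clifford element v^a Gamma_a."""
    M = [[0] * 4 for _ in range(4)]
    for a in range(4):
        M = lincomb(1, M, v[a], GAMMA[a])
    return M

def components(M):
    """Recover v^a from M = v^a Gamma_a, using tr(Gamma_a Gamma_b) = -4 eta_ab."""
    out = []
    for a in range(4):
        prod = matmul(M, GAMMA[a])
        tr = sum(prod[i][i] for i in range(4))
        out.append((-ETA[a] * tr / 4).real)
    return out

def reflect(v, a):
    """-Gamma_a^{-1} v^ Gamma_a, using Gamma_a^{-1} = -eta_aa Gamma_a."""
    inverse = [[-ETA[a] * x for x in r] for r in GAMMA[a]]
    return components(neg(matmul(matmul(inverse, hat(v)), GAMMA[a])))

def rotate(v, a, b, alpha):
    """exp(-alpha/2 Gamma_ab) v^ exp(alpha/2 Gamma_ab) for spatial a != b."""
    g_ab = matmul(GAMMA[a], GAMMA[b])
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    left = lincomb(c, ONE, -s, g_ab)    # exp(-alpha/2 Gamma_ab)
    right = lincomb(c, ONE, s, g_ab)    # exp(+alpha/2 Gamma_ab)
    return components(matmul(matmul(left, hat(v)), right))
```

With these sign conventions, `rotate` sends the unit vector in direction 1 to $\cos(\alpha)\, e_1 - \sin(\alpha)\, e_2$ when applied in the $(1,2)$-plane, and `reflect(v, a)` flips only the $a$-component of `v`.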

Remark

Since the reflections, rotations and boosts in example 7 are given by conjugation actions, there is a crucial ambiguity in the Clifford elements that induce them:

  1. the conjugation action by $\Gamma_a$ coincides precisely with the conjugation action by $-\Gamma_a$;

  2. the conjugation action by $\exp(\tfrac{\alpha}{2} \Gamma_{a b})$ coincides precisely with the conjugation action by $-\exp(\tfrac{\alpha}{2}\Gamma_{a b})$.

Definition

(spin group)

For $d \in \mathbb{N}$, the spin group $Spin(d-1,1)$ is the group of even graded elements of the Clifford algebra $Cl(\mathbb{R}^{d-1,1})$ (def. 25) which are unitary with respect to the conjugation operation $\overline{(-)}$ from def. 25:

$$ Spin(d-1,1) \;\coloneqq\; \left\{ A \in Cl(\mathbb{R}^{d-1,1})_{even} \;\vert\; \overline{A} A = 1 \right\} \,. $$
Proposition

The function

$$ Spin(d-1,1) \longrightarrow GL(\mathbb{R}^{d-1,1}) $$

from the spin group (def. 26) to the general linear group in $d$ dimensions, given by sending $A \in Spin(d-1,1) \hookrightarrow Cl(\mathbb{R}^{d-1,1})$ to the conjugation action

$$ \overline{A}(-) A $$

(via the identification of Minkowski spacetime as the subspace of the Clifford algebra containing the linear combinations of the generators, according to remark 4)

is

  1. a group homomorphism onto the proper orthochronous Lorentz group (def. 24):

    $$ Spin(d-1,1) \longrightarrow SO^+(d-1,1) $$

  2. exhibiting a $\mathbb{Z}/2$-central extension.

Proof

That the function is a group homomorphism into the general linear group, hence that it acts by linear transformations on the generators, follows by using that it clearly lands in automorphisms of the Clifford algebra.

That the function lands in the Lorentz group $O(d-1,1) \hookrightarrow GL(d)$ follows from remark 4, using $A \overline{A} = 1$ in the second step and that the anti-commutator is a multiple of the unit, hence central, in the third step:

$$ \begin{aligned} \eta(\overline{A}v_1A , \overline{A} v_2 A) &= -\tfrac{1}{2} \left( \left(\overline{A} \hat v_1 A\right) \left(\overline{A}\hat v_2 A\right) + \left(\overline{A} \hat v_2 A\right) \left(\overline{A} \hat v_1 A\right) \right) \\ & = -\tfrac{1}{2} \left( \overline{A}(\hat v_1 \hat v_2 + \hat v_2 \hat v_1) A \right) \\ & = -\overline{A} A \tfrac{1}{2}\left( \hat v_1 \hat v_2 + \hat v_2 \hat v_1\right) \\ & = \eta(v_1, v_2) \end{aligned} \,. $$

That it moreover lands in the proper Lorentz group $SO(d-1,1)$ follows from observing (example 7) that every reflection is given by the conjugation action by a linear combination of generators, which are excluded from the group $Spin(d-1,1)$ (as that is defined to be in the even subalgebra).

To see that the homomorphism is surjective, use that all elements of $SO^+(d-1,1)$ are products of rotations in hyperplanes. If a hyperplane is spanned by the bivector $(\omega^{a b})$, then such a rotation is given, via example 7, by the conjugation action by

$$ \exp(\tfrac{\alpha}{2} \omega^{a b}\Gamma_{a b}) $$

for some $\alpha$, hence is in the image.

That the kernel is $\mathbb{Z}/2$ is clear from the fact that the only even Clifford elements which commute with all vectors are the multiples $a \in \mathbb{R} \hookrightarrow Cl(\mathbb{R}^{d-1,1})$ of the identity. For these $\overline{a} = a$ and hence the condition $\overline{a} a = 1$ is equivalent to $a^2 = 1$. It is clear that these two elements $\{+1,-1\}$ are in the center of $Spin(d-1,1)$. This kernel reflects the ambiguity from remark 5.
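The two-to-one nature of the cover can be seen concretely in dimension $3+1$. The following sketch does not use the Clifford-algebra definition above; instead it assumes the identification $Spin(3,1) \simeq SL(2,\mathbb{C})$ from the table above, acting on hermitian matrices by $A \mapsto G A G^\dagger$, together with the coordinate convention $y = x_2 + i x_3$ (an assumption of this example). It computes the $4 \times 4$ Lorentz matrix induced by $G$ and by $-G$.

```python
def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(X):
    return [[X[j][i].conjugate() for j in range(2)] for i in range(2)]

def herm_from_vector(x):
    """Vector (x0, x1, x2, x3) -> hermitian matrix [[x0 + x1, y], [conj(y), x0 - x1]], y = x2 + i x3."""
    x0, x1, x2, x3 = x
    y = complex(x2, x3)
    return [[complex(x0 + x1), y], [y.conjugate(), complex(x0 - x1)]]

def vector_from_herm(A):
    return [(A[0][0] + A[1][1]).real / 2, (A[0][0] - A[1][1]).real / 2, A[0][1].real, A[0][1].imag]

def lorentz_matrix(G):
    """Matrix of the linear map A |-> G A G^dagger in the coordinates (x0, x1, x2, x3)."""
    columns = []
    for a in range(4):
        e_a = [1.0 if b == a else 0.0 for b in range(4)]
        image = matmul2(matmul2(G, herm_from_vector(e_a)), dagger(G))
        columns.append(vector_from_herm(image))
    return [[columns[a][i] for a in range(4)] for i in range(4)]

def det(X):
    """Determinant by Laplace expansion along the first row."""
    if len(X) == 1:
        return X[0][0]
    return sum((-1) ** j * X[0][j] * det([row[:j] + row[j + 1:] for row in X[1:]])
               for j in range(len(X)))

ETA = [-1.0, 1.0, 1.0, 1.0]

def preserves_eta(L, tol=1e-6):
    """Check L^T eta L = eta for the diagonal metric eta."""
    return all(abs(sum(L[k][i] * ETA[k] * L[k][j] for k in range(4)) - (ETA[i] if i == j else 0.0)) < tol
               for i in range(4) for j in range(4))

a, b, c = 1 + 2j, 0.5j, 3 + 0j
G = [[a, b], [c, (1 + b * c) / a]]           # det(G) = 1
minus_G = [[-z for z in row] for row in G]

lam = lorentz_matrix(G)
lam_minus = lorentz_matrix(minus_G)
```

Here `lam` and `lam_minus` coincide, `lam` preserves $\eta$, has determinant $+1$ and a positive time-time entry, so $G$ and $-G$ are two distinct preimages of the same proper orthochronous Lorentz transformation.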

\,

Spinors in dimensions 3, 4, 6 and 10

We now discuss how real spin representations (def. 26) in spacetime dimensions 3, 4, 6 and 10 are naturally induced from linear algebra over the four real alternative division algebras (prop. 5).

Definition

(Clifford algebra via normed division algebra)

Let $\mathbb{K}$ be one of the four real normed division algebras from prop. 5, hence one of the four real alternative division algebras from prop. 8.

Define a real linear map

$$ \Gamma \;\colon\; \mathbb{R}^{dim_{\mathbb{R}}(\mathbb{K})+1,1} \longrightarrow End_{\mathbb{R}}(\mathbb{K}^4) $$

from (the real vector space underlying) Minkowski spacetime to real linear maps on $\mathbb{K}^4$

$$ \Gamma(A) \left( \array{ \psi \\ \phi } \right) \;\coloneqq\; \left( \array{ - \tilde A \phi \\ A \psi } \right) \,. $$

Here on the right we are using the isomorphism from prop. 10 for identifying a spacetime vector with a $2 \times 2$-matrix, and we are using the trace reversal $\widetilde{(-)}$ from def. 22.

Remark

(Clifford multiplication via octonion-valued matrices)

Each operation of $\Gamma(A)$ in def. 27 is clearly a linear map, even for $\mathbb{K}$ being the non-associative octonions. The only point to beware of is that for $\mathbb{K}$ the octonions, the composition of two such linear maps is not in general given by the usual matrix product.

Proposition

(real spin representations via normed division algebras)

The map $\Gamma$ in def. 27 gives a representation of the Clifford algebra $Cl(\mathbb{R}^{dim_{\mathbb{R}}(\mathbb{K})+1,1})$ (def. 25), i.e. of

  1. $Cl(\mathbb{R}^{2,1})$ for $\mathbb{K} = \mathbb{R}$;

  2. $Cl(\mathbb{R}^{3,1})$ for $\mathbb{K} = \mathbb{C}$;

  3. $Cl(\mathbb{R}^{5,1})$ for $\mathbb{K} = \mathbb{H}$;

  4. $Cl(\mathbb{R}^{9,1})$ for $\mathbb{K} = \mathbb{O}$.

Hence this Clifford representation induces representations of the spin group $Spin(dim_{\mathbb{R}}(\mathbb{K})+1,1)$ on the real vector spaces

$$ S_{\pm } \coloneqq \mathbb{K}^2 \,, $$

and hence on

$$ S \coloneqq S_+ \oplus S_- \,. $$

(Baez-Huerta 09, p. 6)

Proof

We need to check that the Clifford relation

$$ \begin{aligned} (\Gamma(A))^2 & = -\eta(A,A)1 \\ & = + det(A) \end{aligned} $$

is satisfied (where we used (11) and (8)). Now by definition, for any $(\phi,\psi) \in \mathbb{K}^4$,

$$ (\Gamma(A))^2 \left( \array{ \phi \\ \psi } \right) \;=\; - \left( \array{ \tilde A(A \phi) \\ A(\tilde A \psi) } \right) \,, $$

where on the right we have in each component ordinary matrix product expressions.

Now observe that both expressions on the right are sums of triple products that involve either one real factor or two factors that are conjugate to each other:

$$ \begin{aligned} A (\tilde A \psi) & = \left( \array{ x_0 + x_1 & y \\ y^\ast & x_0 - x_1 } \right) \cdot \left( \array{ (-x_0 + x_1) \psi_1 + y \psi_2 \\ y^\ast \psi_1 - (x_0 + x_1)\psi_2 } \right) \\ & = \left( \array{ (-x_0^2 + x_1^2) \psi_1 + (x_0 + x_1)(y \psi_2) + y (y^\ast \psi_1) - y( (x_0 + x_1) \psi_2 ) \\ \cdots } \right) \end{aligned} \,. $$

The associators of triple products that involve a real factor, and of those that involve both an element and its conjugate, vanish by prop. 6 (hence ultimately by Artin’s theorem, prop. 9). Hence all associators involved vanish, so that we may rebracket to obtain

$$ (\Gamma(A))^2 \left( \array{ \phi \\ \psi } \right) \;=\; - \left( \array{ (\tilde A A) \phi \\ (A \tilde A) \psi } \right) \,. $$

This implies the statement via the equality $-A \tilde A = -\tilde A A = det(A)$ (prop. 11).
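In the associative case $\mathbb{K} = \mathbb{C}$ the map $\Gamma(A)$ is an honest $4 \times 4$ complex matrix with blocks $-\tilde A$ and $A$, and the Clifford relation can be verified directly. This is a sketch for sample matrices only; the helper names are made up, and the octonionic case (where composition is not a matrix product) is not covered.

```python
def herm(x0, x1, y):
    """Hermitian matrix [[x0 + x1, y], [conj(y), x0 - x1]]."""
    y = complex(y)
    return [[complex(x0 + x1), y], [y.conjugate(), complex(x0 - x1)]]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace_reversal(X):
    t = X[0][0] + X[1][1]
    return [[X[0][0] - t, X[0][1]], [X[1][0], X[1][1] - t]]

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def eta(A, B):
    """(1/2) Re tr(A B~)."""
    P = matmul(A, trace_reversal(B))
    return 0.5 * (P[0][0] + P[1][1]).real

def gamma(A):
    """Block matrix of (psi, phi) |-> (-A~ phi, A psi) acting on columns (psi_1, psi_2, phi_1, phi_2)."""
    At = trace_reversal(A)
    return [
        [0, 0, -At[0][0], -At[0][1]],
        [0, 0, -At[1][0], -At[1][1]],
        [A[0][0], A[0][1], 0, 0],
        [A[1][0], A[1][1], 0, 0],
    ]

A = herm(2.0, 0.5, 1 + 3j)
B = herm(-1.0, 4.0, 2 - 1j)

square = matmul(gamma(A), gamma(A))   # should equal det(A) * identity = -eta(A, A) * identity
GA, GB = gamma(A), gamma(B)
AB, BA = matmul(GA, GB), matmul(GB, GA)
anticommutator = [[AB[i][j] + BA[i][j] for j in range(4)] for i in range(4)]   # -2 eta(A, B) * identity
```

The second check is the polarized form of the relation, $\Gamma(A)\Gamma(B) + \Gamma(B)\Gamma(A) = -2\eta(A,B)$, which follows from the quadratic one by linearity of $\Gamma$.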

Proposition

(spinor bilinear pairings)

Let $\mathbb{K}$ be one of the four real normed division algebras and $S_\pm \simeq_{\mathbb{R}}\mathbb{K}^2$ the corresponding spin representation from prop. 15.

Then there are bilinear maps from two spinors (according to prop. 15) to the real numbers

$$ \overline{(-)}(-) \;\colon\; S_+ \otimes S_-\longrightarrow \mathbb{R} $$

as well as to $\mathbb{R}^{dim(\mathbb{K})+1,1}$

$$ \overline{(-)}\Gamma (-) \;\colon\; S_\pm \otimes S_{\pm}\longrightarrow \mathbb{R}^{dim(\mathbb{K})+1,1} $$

given, respectively, by forming the real part (def. 19) of the canonical $\mathbb{K}$-inner product

$$ \overline{(-)}(-) \colon S_+\otimes S_- \longrightarrow \mathbb{R} $$
$$ (\psi,\phi)\mapsto \overline{\psi} \phi \coloneqq Re(\psi^\dagger \cdot \phi) $$

and by forming the product of a column vector with a row vector to produce a matrix, possibly up to trace reversal (def. 22), under the identification $\mathbb{R}^{dim(\mathbb{K})+1,1} \simeq Mat^{her}_{2 \times 2}(\mathbb{K})$ from prop. 10:

$$ S_+ \otimes S_+ \longrightarrow \mathbb{R}^{dim(\mathbb{K})+1,1} $$
$$ (\psi , \phi) \mapsto \overline{\psi}\Gamma \phi \coloneqq \widetilde{\psi \phi^\dagger + \phi \psi^\dagger} $$

and

$$ S_- \otimes S_- \longrightarrow \mathbb{R}^{dim(\mathbb{K})+1,1} $$
$$ (\psi , \phi) \mapsto {\psi \phi^\dagger + \phi \psi^\dagger} $$

For $A \in Mat^{her}_{2 \times 2}(\mathbb{K})$ the $A$-component of this map is

$$ \eta(\overline{\psi}\Gamma \phi, A) = Re (\psi^\dagger (A\phi)) \,. $$

These pairings have the following properties

  1. both are $Spin(dim(\mathbb{K})+1,1)$-equivariant;

  2. the pairing $\overline{(-)}\Gamma(-)$ is symmetric:

    (12) $$ \overline{\psi_1} \,\Gamma\, \psi_2 = + \overline{\psi_2}\, \Gamma\, \psi_1 \phantom{AAAA} \text{for} \phantom{AA} \psi_1, \psi_2 \in S_+ \oplus S_- $$

(Baez-Huerta 09, prop. 8, prop. 9).
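For $\mathbb{K} = \mathbb{C}$ the vector-valued pairing on $S_+$ and its component formula can be tested on sample spinors. This is a sketch under the assumption that $\eta$ is computed as $\tfrac{1}{2} Re(tr(A \tilde B))$ (prop. 11); the helper names and the sample values are illustrative only, and the check covers the $S_+ \otimes S_+$ case.

```python
def herm(x0, x1, y):
    y = complex(y)
    return [[complex(x0 + x1), y], [y.conjugate(), complex(x0 - x1)]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace_reversal(X):
    t = X[0][0] + X[1][1]
    return [[X[0][0] - t, X[0][1]], [X[1][0], X[1][1] - t]]

def eta(A, B):
    """(1/2) Re tr(A B~)."""
    P = matmul(A, trace_reversal(B))
    return 0.5 * (P[0][0] + P[1][1]).real

def outer(psi, phi):
    """Column vector psi times row vector phi^dagger."""
    return [[psi[i] * phi[j].conjugate() for j in range(2)] for i in range(2)]

def pairing_plus(psi, phi):
    """Trace reversal of psi phi^dagger + phi psi^dagger, the pairing on S_+ (x) S_+."""
    M, N = outer(psi, phi), outer(phi, psi)
    return trace_reversal([[M[i][j] + N[i][j] for j in range(2)] for i in range(2)])

def sandwich(psi, A, phi):
    """Re(psi^dagger (A phi))."""
    A_phi = [A[i][0] * phi[0] + A[i][1] * phi[1] for i in range(2)]
    return (psi[0].conjugate() * A_phi[0] + psi[1].conjugate() * A_phi[1]).real

psi = [1 + 1j, 2 - 1j]
phi = [0.5j, -1 + 2j]
A = herm(2.0, 0.5, 1 + 3j)

vector = pairing_plus(psi, phi)   # a hermitian matrix, i.e. a vector in R^{3,1}
```

For this sample, `eta(vector, A)` agrees with `sandwich(psi, A, phi)`, and `pairing_plus` gives the same matrix when its two arguments are exchanged.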

Remark

(two-component spinor notation)

In the physics/QFT literature the expressions for spin representations given by prop. 15 are traditionally written in two-component spinor notation as follows:

  • An element of $S_+$ is denoted $(\chi_a \in \mathbb{K})_{a = 1,2}$ and called a left handed spinor;

  • an element of $S_-$ is denoted $(\xi^{\dagger \dot a})_{\dot a = 1,2}$ and called a right handed spinor;

  • an element of $S = S_+ \oplus S_-$ is denoted

    (13) $$ (\psi^\alpha) = \left( (\chi_a), (\xi^{\dagger \dot a}) \right) $$

    and called a Dirac spinor;

and the Clifford action of def. 27 corresponds to the generalized “Pauli matrices”:

  • a hermitian matrix $A \in Mat^{her}_{2\times 2}(\mathbb{K})$ as in prop. 10 regarded as a linear map $S_- \to S_+$ via def. 27 is denoted

    $$ \left(x_\mu \sigma^\mu_{a \dot a}\right) \;\coloneqq\; \left( \array{ x_0 + x_1 & y \\ y^\ast & x_0 - x_1 } \right) \,; $$

  • the negative of the trace-reversal (def. 22) of such a hermitian matrix, regarded as a linear map $S_+ \to S_-$, is denoted

    $$ \left( x_\mu \widetilde \sigma^{\mu \dot a a} \right) \;\coloneqq\; - \left( \array{ -x_0 + x_1 & y \\ y^\ast & -x_0 - x_1 } \right) \,; $$

  • the corresponding Clifford generator $\Gamma(A) \;\colon\; S_+ \oplus S_- \to S_+ \oplus S_-$ (def. 27) is denoted

    $$ x_\mu (\gamma^\mu)_{\alpha \beta} \;\coloneqq\; \left( \array{ 0 & x_\mu \sigma^\mu_{a \dot b} \\ x_\mu \widetilde \sigma^{\mu \dot a b} & 0 } \right) \,; $$

  • the bilinear spinor-to-vector pairing from prop. 16 is written as the matrix multiplication

    $$ \left( \overline{\psi} \, \gamma^\mu \, \phi\right) \;\coloneqq\; \overline{\psi}\,\Gamma \,\phi \,, $$

    where the Dirac conjugate $\overline{\psi}$ on the left is given on $(\psi_\alpha) = (\chi_a, \xi^{\dagger \dot c})$ by

    (14) $$ \begin{aligned} \overline{\psi} & \coloneqq \psi^\dagger \gamma^0 \\ & = ( \xi^a, \chi^\dagger_{\dot a} ) \end{aligned} $$

    hence, with (13):

    (15) $$ \begin{aligned} \overline{\psi_1} \,\gamma^\mu\, \psi_2 & = \psi_1^\dagger \, \gamma^0 \gamma^\mu \, \psi_2 \\ & = (\xi_1)^a \, \sigma^\mu_{a \dot c}\, (\xi_2)^{\dagger \dot c} + (\chi_1)^\dagger_{\dot a} \, \widetilde \sigma^{\mu \dot a c} \, (\chi_2)_c \end{aligned} $$

Finally, it is common to abbreviate contractions with the Clifford algebra generators $(\gamma^\mu)$ by a slash, as in

$$ k\!\!\!/\, \;\coloneqq\; \gamma^\mu k_\mu $$

or

(16) $$ i \partial\!\!\!/\, \;\coloneqq\; i \gamma^\mu \frac{\partial}{\partial x^\mu} \,. $$

This is called the Feynman slash notation.

(e.g. Dermisek I-8, Dermisek I-9)

Below we spell out the example of the Lagrangian field theory of the Dirac field in detail (example 43). For the discussion of massive chiral spinor fields one also needs the following, which we mention here just for completeness:

Proposition

(chiral spinor mass pairing)

In dimension 2+1 and 3+1, there exists a non-trivial skew-symmetric pairing

$$ \epsilon \;\colon\; S \wedge S \longrightarrow \mathbb{R} $$

which may be normalized such that in the two-component spinor basis of remark 7 we have

(17) $$ \tilde \sigma^{\mu \dot a a} = \epsilon^{a b} \epsilon^{\dot a \dot b} \sigma^\mu_{b \dot b} \,. $$
Proof

Take the non-vanishing components of $\epsilon$ to be

$$ \epsilon^{1 2} = \epsilon^{\dot 1 \dot 2} = \epsilon_{21} = \epsilon_{\dot 2 \dot 1} = 1 $$

and

$$ \epsilon^{2 1} = \epsilon^{\dot 2 \dot 1} = \epsilon_{1 2} = \epsilon_{\dot 1 \dot 2} = -1 \,. $$

With this, equation (17) is checked explicitly. It is clear that $\epsilon$ thus defined is skew-symmetric as long as the component algebra is commutative, which is the case for $\mathbb{K}$ being $\mathbb{R}$ or $\mathbb{C}$.

\,

Causal structure

We need to consider the following concepts and constructions related to the causal structure of Minkowski spacetime $\Sigma$ (def. 23).

Definition

(spacelike, timelike, lightlike directions; past and future)

Given two points $x,y \in \Sigma$ in Minkowski spacetime (def. 23), write

$$ v \coloneqq y - x \in \mathbb{R}^{p,1} $$

for their difference, using the vector space structure underlying Minkowski spacetime.

Recall the Minkowski inner product $\eta$ on $\mathbb{R}^{p,1}$, given by prop./def. 10. Then via remark 3 we say that the difference vector $v$ is

  1. spacelike if $\eta(v,v) \gt 0$,

  2. timelike if $\eta(v,v) \lt 0$,

  3. lightlike if $\eta(v,v) = 0$.

If $v$ is timelike or lightlike then we say that

  1. $y$ is in the future of $x$ if $y^0 - x^0 \geq 0$;

  2. $y$ is in the past of $x$ if $y^0 - x^0 \leq 0$.
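The case distinction above translates directly into a small classification routine. This is a sketch for $\mathbb{R}^{3,1}$ with the mostly-plus metric; the function names are hypothetical, and the exact comparison with zero is only reliable for exactly representable inputs (for floating-point data one would compare against a tolerance instead).

```python
def eta(v, w):
    """Mostly-plus Minkowski inner product: -v^0 w^0 + sum_i v^i w^i."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def direction_type(x, y):
    """Classify the difference vector v = y - x as spacelike, timelike or lightlike."""
    v = [b - a for a, b in zip(x, y)]
    q = eta(v, v)
    if q > 0:
        return "spacelike"
    if q < 0:
        return "timelike"
    return "lightlike"

def temporal_relations(x, y):
    """Set of labels among {'future', 'past'} describing where y lies relative to x."""
    if direction_type(x, y) == "spacelike":
        return set()
    dt = y[0] - x[0]
    relations = set()
    if dt >= 0:
        relations.add("future")
    if dt <= 0:
        relations.add("past")
    return relations

origin = (0, 0, 0, 0)
```

For instance `direction_type(origin, (2, 1, 0, 0))` is `"timelike"` with `temporal_relations` equal to `{"future"}`, while `(0, 1, 0, 0)` is spacelike separated from the origin and lies neither in its future nor in its past. Note that by the definition above the point `x` itself is both in the future and in the past of `x`.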

Definition

(causal cones)

For $x \in \Sigma$ a point in spacetime (an event), we write

$$ V^+(x), V^-(x) \subset \Sigma $$

for the subsets of events that are in the timelike future or in the timelike past of $x$, respectively (def. 29), called the open future cone and open past cone, respectively, and

$$ \overline{V}^+(x), \overline{V}^-(x) \subset \Sigma $$

for the subsets of events that are in the timelike or lightlike future or past, respectively, called the closed future cone and closed past cone, respectively.

The union

$$ J(x) \coloneqq \overline{V}^+(x) \cup \overline{V}^-(x) $$

of the closed future cone and past cone is called the full causal cone of the event $x$. Its boundary is the light cone.

More generally for $S \subset \Sigma$ a subset of events we write

$$ \overline{V}^\pm(S) \;\coloneqq\; \underset{x \in S}{\cup} \overline{V}^{\pm}(x) $$

for the union of the future/past closed cones of all events in the subset.

Definition

(compactly sourced causal support)

Consider a vector bundle $E \overset{}{\to} \Sigma$ (def. 7) over Minkowski spacetime (def. 23).

Write $\Gamma_{\Sigma}(E)$ for the space of smooth sections (def. 5), and write

$$ \begin{aligned} \Gamma_{\Sigma,cp}(E) & \;\; \text{compact support} \\ \Gamma_{\Sigma,\pm cp}(E) & \;\; \text{compactly sourced future/past support} \\ \Gamma_{\Sigma,scp}(E) & \;\; \text{spacelike compact support} \\ \Gamma_{\Sigma,(f/p)cp}(E) & \;\; \text{future/past compact support} \\ \Gamma_{\Sigma,tcp}(E) & \;\; \text{timelike compact support} \end{aligned} $$

for the subsets of those smooth sections whose support is

  1. ($cp$) inside a compact subset,

  2. ($\pm cp$) inside the closed future cone/closed past cone, respectively, of a compact subset,

  3. ($scp$) inside the closed causal cone of a compact subset, which equivalently means that the intersection with every (spacelike) Cauchy surface is compact (Sanders 13, theorem 2.2),

  4. ($fcp$) inside the past of a Cauchy surface (Sanders 13, def. 3.2),

  5. ($pcp$) inside the future of a Cauchy surface (Sanders 13, def. 3.2),

  6. ($tcp$) inside the future of one Cauchy surface and the past of another (Sanders 13, def. 3.2).

(Bär 14, section 1, Khavkine 14, def. 2.1)

Definition

(causal order)

Consider the relation on the set $P(\Sigma)$ of subsets of spacetime which says that a subset $S_1 \subset \Sigma$ is not prior to a subset $S_2 \subset \Sigma$, denoted $S_1 \geq S_2$, if $S_1$ does not intersect the causal past of $S_2$ (def. 30), or equivalently if $S_2$ does not intersect the causal future of $S_1$:

$$ \begin{aligned} (S_1 \geq S_2) & \coloneqq S_1 \cap \overline{V}^-(S_2) = \emptyset \\ & \Leftrightarrow S_2 \cap \overline{V}^+(S_1) = \emptyset \end{aligned} \,. $$

If $S_1 \geq S_2$ and $S_2 \geq S_1$ we say that the two subsets are spacelike separated.
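For finite sets of events the relation can be evaluated directly from the definitions of the closed cones. The following sketch works in $\mathbb{R}^{3,1}$ with the mostly-plus metric; the function names are assumptions of this example, and the second formulation is included only to illustrate the stated equivalence.

```python
def eta(v, w):
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def in_closed_past_cone(y, x):
    """True if y lies in the closed past cone of x: y - x is timelike or lightlike and points to the past."""
    v = [b - a for a, b in zip(x, y)]
    return eta(v, v) <= 0 and v[0] <= 0

def in_closed_future_cone(y, x):
    v = [b - a for a, b in zip(x, y)]
    return eta(v, v) <= 0 and v[0] >= 0

def not_prior(S1, S2):
    """S1 >= S2: no event of S1 lies in the closed past cone of an event of S2."""
    return not any(in_closed_past_cone(s1, s2) for s1 in S1 for s2 in S2)

def not_prior_via_future(S1, S2):
    """Equivalent formulation: no event of S2 lies in the closed future cone of an event of S1."""
    return not any(in_closed_future_cone(s2, s1) for s1 in S1 for s2 in S2)

def spacelike_separated(S1, S2):
    return not_prior(S1, S2) and not_prior(S2, S1)

early = [(0, 0, 0, 0)]
late = [(5, 0, 0, 0)]
far_away = [(0, 10, 0, 0)]
```

Here `not_prior(late, early)` holds while `not_prior(early, late)` fails, and `early` and `far_away` are spacelike separated.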

Definition

(causal complement and causal closure of subset of spacetime)

For $S \subset X$ a subset of spacetime, its causal complement $S^\perp$ is the complement of the causal cone:

$$ S^\perp \;\coloneqq\; X \setminus J_X(S) \,. $$

The causal complement $S^{\perp \perp}$ of the causal complement $S^\perp$ is called the causal closure. If

$$ S = S^{\perp \perp} $$

then the subset $S$ is called a causally closed subset.

Given a spacetime $\Sigma$, we write

$$ CausClsdSubsets(\Sigma) \;\in\; Cat $$

for the partially ordered set of causally closed subsets, partially ordered by inclusion $\mathcal{O}_1 \subset \mathcal{O}_2$.

Definition

(adiabatic switching)

For a causally closed subset $\mathcal{O} \subset \Sigma$ of spacetime (def. 8) say that an adiabatic switching function or infrared cutoff function for $\mathcal{O}$ is a smooth function $g_{sw}$ of compact support (a bump function) whose restriction to some neighbourhood $U$ of $\mathcal{O}$ is the constant function with value $1$:

$$ Cutoffs(\mathcal{O}) \;\coloneqq\; \left\{ g_{sw} \in C^\infty_c(\Sigma) \;\vert\; \underset{ {U \supset \mathcal{O}} \atop { \text{neighbourhood} } }{\exists} \left( g_{sw}\vert_U = 1 \right) \right\} \,. $$

Often we consider the vector space $C^\infty(\Sigma)\langle g \rangle$ spanned by a formal variable $g$ (the coupling constant) under multiplication with smooth functions, and consider as adiabatic switching functions the corresponding images in this space,

$$ \array{ C_c^\infty(\Sigma) &\overset{\simeq}{\longrightarrow}& C_c^\infty(\Sigma)\langle g\rangle } $$

which are thus bump functions that are constant over a neighbourhood $U$ of $\mathcal{O}$, with value not $1$ but the formal parameter $g$:

$$ g_{sw}\vert_U = g \,. $$

In this sense we may think of the adiabatic switching as being the spacetime-dependent coupling “constant”.
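
To see that such switching functions exist, here is one standard construction, given as a sketch. It assumes $\Sigma$ is Minkowski spacetime with its canonical coordinates, writes $\vert x \vert$ for the Euclidean norm of the coordinate vector, and assumes $\mathcal{O}$ is contained in the open coordinate ball $U = \{ \vert x \vert \lt 1 \}$. The auxiliary functions $h$ and $\rho$ are names introduced only for this illustration.

```latex
% Assumed helpers (not from the text): h, \rho; |x| is the Euclidean norm of coordinates.
\begin{aligned}
  h(t) & \coloneqq \begin{cases} e^{-1/t} & t \gt 0 \\ 0 & t \leq 0 \end{cases}
  \\
  \rho(s) & \coloneqq \frac{h(2 - s)}{h(2 - s) + h(s - 1)}
  \\
  g_{sw}(x) & \coloneqq \rho(|x|)
  \\
  g_{sw}\vert_{\{|x| \leq 1\}} & = 1 \,,
  \qquad
  g_{sw}\vert_{\{|x| \geq 2\}} = 0
\end{aligned}
```

The denominator never vanishes, since $2 - s \leq 0$ and $s - 1 \leq 0$ cannot hold at once, so $g_{sw}$ is smooth, has compact support, and equals $1$ on the neighbourhood $U$ of $\mathcal{O}$. Multiplying by the formal parameter gives the version $g \cdot g_{sw}$ that is constant on $g$ over $U$.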

The following lemma 1 will be key in the derivation (proof of prop. 92 below) of the causal locality of the algebra of quantum observables in perturbative quantum field theory:

Lemma

(causal partition)

Let $\mathcal{O} \subset \Sigma$ be a causally closed subset (def. 8) and let $f \in C^\infty_{cp}(\Sigma)$ be a compactly supported smooth function which vanishes on a neighbourhood $U \supset \mathcal{O}$, i.e. $f\vert_U = 0$.

Then there exists a causal partition of $f$, in that there exist compactly supported smooth functions $a, r \in C^\infty_{cp}(\Sigma)$ such that

  1. they sum up to $f$:

    $$ f = a + r $$
  2. their support satisfies the following causal ordering (def. 32)

    $$ supp(a) \geq \mathcal{O} \geq supp(r) \,. $$
Proof idea

By assumption $\mathcal{O}$ has a Cauchy surface. This may be extended to a Cauchy surface $\Sigma_p$ of $\Sigma$, such that this is one leaf of a foliation of $\Sigma$ by Cauchy surfaces, given by a diffeomorphism $\Sigma \simeq (-1,1) \times \Sigma_p$ with the original $\Sigma_p$ at zero. There exists then $\epsilon \in (0,1)$ such that the restriction of $supp(f)$ to the interval $(-\epsilon, \epsilon)$ is in the causal complement $\overline{\mathcal{O}}$ of the given region (def. 8):

$$ supp(f) \cap \left( (-\epsilon, \epsilon) \times \Sigma_p \right) \;\subset\; \overline{\mathcal{O}} \,. $$

Let then $\chi \colon \Sigma \to \mathbb{R}$ be any smooth function with

  1. $\chi\vert_{(-1,0] \times \Sigma_p} = 1$

  2. $\chi\vert_{(\epsilon,1) \times \Sigma_p} = 0$.

Then

$$ r \coloneqq \chi \cdot f \phantom{AAA} \text{and} \phantom{AAA} a \coloneqq (1-\chi) \cdot f $$

are smooth functions as required.
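
The elementary part of this claim can be spelled out as follows. This is only a sketch of the bookkeeping: it checks the sum and locates the supports in the time slabs of the foliation, and it leaves the causal ordering relative to $\mathcal{O}$ to the argument indicated above.

```latex
% Bookkeeping for r = \chi f and a = (1 - \chi) f, using only the two conditions on \chi.
\begin{aligned}
  a + r & = (1 - \chi) \cdot f + \chi \cdot f = f
  \\
  supp(r) & \subset supp(f) \cap \left( (-1, \epsilon] \times \Sigma_p \right)
  && \text{since } \chi = 0 \text{ on } (\epsilon, 1) \times \Sigma_p
  \\
  supp(a) & \subset supp(f) \cap \left( [0, 1) \times \Sigma_p \right)
  && \text{since } 1 - \chi = 0 \text{ on } (-1, 0] \times \Sigma_p
\end{aligned}
```

In particular both $a$ and $r$ are compactly supported, because their supports are closed subsets of the compact set $supp(f)$.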

\,

This concludes our discussion of spin and spacetime. In the next chapter we consider the concept of fields on spacetime.

\,

Fields

A field history on a given spacetime $\Sigma$ (a history of spatial field configurations, see remark 8 below) is a quantity assigned to each point of spacetime (each event), such that this assignment varies smoothly with spacetime points. For instance an electromagnetic field history (example 11 below) is at each point of spacetime a collection of vectors that encode the direction in which a charged particle passing through that point would feel a force (the “Lorentz force”, see example 11 below).

This is readily formalized (def. 34 below): If $F$ denotes the smooth manifold of “values” that the given kind of field may take at any spacetime point, then a field history $\Phi$ is modeled as a smooth function from spacetime to this space of values:

$$ \Phi \;\colon\; \Sigma \longrightarrow F \,. $$

It will be useful to unify spacetime and the space of field values (the field fiber) into a single manifold, the Cartesian product

$$ E \;\coloneqq\; \Sigma \times F $$

and to think of this equipped with the projection map onto the first factor as a fiber bundle of spaces of field values over spacetime

$$ \array{ E &\coloneqq& \Sigma \times F \\ {}^{\mathllap{fb}}\downarrow & \swarrow_{\mathrlap{pr_1}} \\ \Sigma } \,. $$

This is then called the field bundle, which specifies the kind of values that the given field species may take at any point of spacetime. Since the space $F$ of field values is the fiber of this fiber bundle (def. 6), it is sometimes also called the field fiber. (See also at fiber bundles in physics.)

Given a field bundle $E \overset{fb}{\to} \Sigma$, a field history is a section of that bundle (def. 5). The discussion of field theory concerns the space of all possible field histories, hence the space of sections of the field bundle (example 16 below). This is a very “large” generalized smooth space, called a diffeological space (def. 35 below).

Or rather, in the presence of fermion fields such as the Dirac field (example 35 below), the Pauli exclusion principle demands that the field bundle is a super-manifold, and that the fermionic space of field histories (example 53 below) is a super-geometric generalized smooth space: a super smooth set (def. 48 below).

This smooth structure on the space of field histories will be crucial when we discuss observables of a field theory below, because these are smooth functions on the space of field histories. In particular it is this smooth structure which allows one to derive that linear observables of a free field theory are given by distributions (prop. 37 below). Among these are the point evaluation observables (delta distributions) which are traditionally denoted by the same symbol as the field histories themselves.

Hence there are these aspects of the concept of “field” in physics, which are closely related, but crucially different:

\,

aspects of the concept of fields

| aspect | term | type | description | def. |
|---|---|---|---|---|
| field component | $\phi^a$, $\phi^a_{,\mu}$ | $J^\infty_\Sigma(E) \to \mathbb{R}$ | coordinate function on jet bundle of field bundle | def. 34, def. 54 |
| field history | $\Phi$, $\frac{\partial \Phi}{\partial x^\mu}$ | $\Sigma \to J^\infty_\Sigma(E)$ | jet prolongation of section of field bundle | def. 34, def. 55 |
| field observable | $\mathbf{\Phi}^a(x)$, $\partial_{\mu} \mathbf{\Phi}^a(x)$ | $\Gamma_{\Sigma}(E) \to \mathbb{R}$ | derivatives of delta-functional on space of sections | def. 71, example 60 |
| averaging of field observable | $\alpha^\ast \mapsto \underset{\Sigma}{\int} \alpha^\ast_a(x) \mathbf{\Phi}^a(x) \, dvol_\Sigma(x)$ | $\Gamma_{\Sigma,cp}(E^\ast) \to Obs(E_{scp},\mathbf{L})$ | observable-valued distribution | def. 80 |
| algebra of quantum observables | $\left( Obs(E,\mathbf{L})_{\mu c},\, \star\right)$ | $\mathbb{C}Alg$ | non-commutative algebra structure on field observables | def. 127, def. 132 |

\,

We now discuss these topics:

\,

\,

field bundles

Definition

(fields and field histories)

Given a spacetime $\Sigma$, then a type of fields on $\Sigma$ is a smooth fiber bundle (def. 6)

$$ \array{E \\ \downarrow^{\mathrlap{fb}} \\ \Sigma } $$

called the field bundle.

Given a type of fields on $\Sigma$ this way, then a field history of that type on $\Sigma$ is a term of that type, hence is a smooth section (def. 5) of this bundle, namely a smooth function of the form

$$ \Phi \;\colon\; \Sigma \longrightarrow E $$

such that composed with the projection map it is the identity function, i.e. such that

$$ fb \circ \Phi = id \phantom{AAAAAAA} \array{ && E \\ & {}^{\mathllap{\Phi}}\nearrow & \downarrow^{\mathrlap{fb}} \\ \Sigma & = & \Sigma } \,. $$

The set of such sections/field histories is to be denoted

(18) $$ \Gamma_\Sigma(E) \;\coloneqq\; \left\{ \array{ && E \\ & {}^{\mathllap{\Phi}}\nearrow & \downarrow^{\mathrlap{fb}} \\ \Sigma &=& \Sigma } \phantom{fb} \right\} $$
Remark

(field histories are histories of spatial field configurations)

Given a section $\Phi \in \Gamma_\Sigma(E)$ of the field bundle (def. 34) and given a spacelike (def. 29) submanifold $\Sigma_p \hookrightarrow \Sigma$ (def. 44) of spacetime in codimension 1, then the restriction $\Phi\vert_{\Sigma_p}$ of $\Phi$ to $\Sigma_p$ may be thought of as a field configuration in space. As different spatial slices $\Sigma_p$ are chosen, one obtains such field configurations at different times. It is in this sense that the entirety of a section $\Phi \in \Gamma_\Sigma(E)$ is a history of field configurations, hence a field history (def. 34).

Remark

(possible field histories)

After we give the set $\Gamma_\Sigma(E)$ of field histories (18) differential geometric structure, below in example 16 and example 33, we call it the space of field histories. This should be read as space of possible field histories, containing all field histories that qualify as being of the type specified by the field bundle $E$.

After we obtain equations of motion below in def. 61, these serve as the “laws of nature” that field histories should obey, and they define the subspace of those field histories that do solve the equations of motion; this will be denoted

$$ \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L}= 0} \overset{\phantom{AAA}}{\hookrightarrow} \Gamma_\Sigma(E) $$

and called the on-shell space of field histories (41).

For the time being, not to get distracted from the basic idea of quantum field theory, we will focus on the following simple special case of field bundles:

Example

(trivial vector bundle as a field bundle)

In applications the field fiber $F = V$ is often a finite dimensional vector space. In this case the trivial field bundle with fiber $F$ is of course a trivial vector bundle (def. 7).

Choosing any linear basis $(\phi^a)_{a = 1}^s$ of the field fiber, then over Minkowski spacetime (def. 23) we have canonical coordinates on the total space of the field bundle

$$ ( (x^\mu), ( \phi^a ) ) \,, $$

where the index $\mu$ ranges from $0$ to $p$, while the index $a$ ranges from $1$ to $s$.

If this trivial vector bundle is regarded as a field bundle according to def. 34, then a field history $\Phi$ is equivalently an $s$-tuple of real-valued smooth functions $\Phi^a \colon \Sigma \to \mathbb{R}$ on spacetime:

$$ \Phi = ( \Phi^a )_{a = 1}^s \,. $$
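
A concrete instance may help to fix the notation. The following sketch assumes $p = 1$ (so that spacetime has coordinates $(x^0, x^1)$) and $s = 2$; the particular component functions are arbitrary choices made for illustration and are not taken from the text.

```latex
% Illustrative choice: p = 1, s = 2, E = \Sigma \times \mathbb{R}^2.
\begin{aligned}
  \Phi \;\colon\; \Sigma & \longrightarrow \Sigma \times \mathbb{R}^2
  \\
  (x^0, x^1) & \mapsto \left( (x^0, x^1) ,\, \left( \Phi^1(x), \Phi^2(x) \right) \right)
  \qquad \text{with e.g. } \Phi^1(x) = \sin(x^0) ,\; \Phi^2(x) = (x^1)^2
  \\
  fb \circ \Phi \,(x^0, x^1) & = pr_1 \left( (x^0, x^1) ,\, ( \Phi^1(x), \Phi^2(x) ) \right) = (x^0, x^1)
\end{aligned}
```

The section condition $fb \circ \Phi = id$ holds automatically for maps of this form, which is why over a trivial bundle a field history carries no more information than its tuple of component functions.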
Example

(field bundle for real scalar field)

If $\Sigma$ is a spacetime and if the field fiber

$$ F \coloneqq \mathbb{R} $$

is simply the real line, then the corresponding trivial field bundle (def. 34)

$$ \array{ \Sigma \times \mathbb{R} \\ {}^{\mathllap{pr_1}}\downarrow \\ \Sigma } $$

is the trivial real line bundle (a special case of example 9) and the corresponding field type (def. 34) is called the real scalar field on $\Sigma$. A field history of this type is simply a smooth function on $\Sigma$ with values in the real numbers:

(19) $$ \Gamma_\Sigma(\Sigma \times \mathbb{R}) \;\simeq\; C^\infty(\Sigma) \,. $$
Example

(field bundle for electromagnetic field)

On Minkowski spacetime $\Sigma$ (def. 23), let the field bundle (def. 34) be given by the cotangent bundle

$$ E \coloneqq T^\ast \Sigma \,. $$

This is a trivial vector bundle (example 9) with canonical field coordinates $(a_\mu)$.

A section of this bundle, hence a field history, is a differential 1-form

$$ A \in \Gamma_\Sigma(T^\ast \Sigma) = \Omega^1(\Sigma) $$

on spacetime (def. 9). Interpreted as a field history of the electromagnetic field on $\Sigma$, this is often called the vector potential. Then the de Rham differential (def. 12) of the vector potential is a differential 2-form

$$ F \coloneqq d A $$

known as the Faraday tensor. In the canonical coordinate basis of 2-forms this may be expanded as

(20) $$ F = \underoverset{i = 1}{p}{\sum} E_i \, d x^0 \wedge d x^i + \underset{1 \leq i \lt j \leq p}{\sum} B_{i j} \, d x^i \wedge d x^j \,. $$

Here $(E_i)_{i = 1}^p$ are called the components of the electric field, while $(B_{i j})$ are called the components of the magnetic field.
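
The components in (20) can be read off directly from the vector potential. The following is a sketch of that computation, writing $A = A_\nu \, d x^\nu$ in the canonical coordinates; the signs are those implied by the expansion (20) as stated, and conventions for the physical electric field in other sources may differ by a sign.

```latex
% Expanding F = dA in coordinates, with A = A_\nu dx^\nu (summation over repeated indices).
\begin{aligned}
  d A & = \partial_\mu A_\nu \, d x^\mu \wedge d x^\nu
      = \underset{0 \leq \mu \lt \nu \leq p}{\sum}
        \left( \partial_\mu A_\nu - \partial_\nu A_\mu \right) d x^\mu \wedge d x^\nu
  \\
  E_i & = \partial_0 A_i - \partial_i A_0
  \\
  B_{i j} & = \partial_i A_j - \partial_j A_i
\end{aligned}
```

For instance a vector potential with only $A_0 \neq 0$, depending only on the spatial coordinates, has $B_{i j} = 0$ and $E_i = - \partial_i A_0$.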

Example

(field bundle for Yang-Mills field over Minkowski spacetime)

Let $\mathfrak{g}$ be a Lie algebra of finite dimension with linear basis $(t_\alpha)$, in terms of which the Lie bracket is given by

(21) $$ [t_\alpha, t_\beta] \;=\; \gamma^\gamma{}_{\alpha \beta} t_\gamma \,. $$

Over Minkowski spacetime $\Sigma$ (def. 23), consider then the field bundle which is the cotangent bundle tensored with the Lie algebra $\mathfrak{g}$

$$ E \coloneqq T^\ast \Sigma \otimes \mathfrak{g} \,. $$

This is the trivial vector bundle (example 9) with induced field coordinates

$$ ( a_\mu^\alpha ) \,. $$

A section of this bundle is a Lie algebra-valued differential 1-form

$$ A \in \Gamma_\Sigma(T^\ast \Sigma \otimes \mathfrak{g}) = \Omega^1(\Sigma, \mathfrak{g}) $$

with components

$$ A^\ast(a_\mu^\alpha) = A^\alpha_\mu \,. $$

This is called a field history for Yang-Mills gauge theory (at least if $\mathfrak{g}$ is a semisimple Lie algebra, see example 41 below).

For $\mathfrak{g} = \mathbb{R}$ the line Lie algebra, this reduces to the case of the electromagnetic field (example 11).

For $\mathfrak{g} = \mathfrak{su}(3)$ this is a field history for the gauge field of the strong nuclear force in quantum chromodynamics.
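
As a small worked instance of (21), consider $\mathfrak{g} = \mathfrak{su}(2)$. The sketch below assumes the basis $t_\alpha = -\tfrac{i}{2} \sigma_\alpha$ built from the Pauli matrices $\sigma_\alpha$, which is one common choice and is not fixed by the text; other bases rescale the structure constants.

```latex
% Assumed basis: t_\alpha = -(i/2) \sigma_\alpha, using [\sigma_\alpha, \sigma_\beta] = 2 i \epsilon_{\alpha\beta\gamma} \sigma_\gamma.
\begin{aligned}
  [t_\alpha, t_\beta]
  & = \left( -\tfrac{i}{2} \right)^2 [\sigma_\alpha, \sigma_\beta]
    = -\tfrac{1}{4} \cdot 2 i \, \epsilon_{\alpha \beta \gamma} \, \sigma_\gamma
    = \epsilon_{\alpha \beta \gamma} \, t_\gamma
  \\
  \gamma^\gamma{}_{\alpha \beta} & = \epsilon_{\alpha \beta \gamma}
  \\
  \# \{ a_\mu^\alpha \} & = (p + 1) \cdot dim(\mathfrak{g})
  \qquad \text{e.g. } 4 \cdot 3 = 12 \text{ for } \mathfrak{su}(2) ,\;
  4 \cdot 8 = 32 \text{ for } \mathfrak{su}(3) \text{ when } p = 3
\end{aligned}
```

The last line counts the field coordinates of the bundle $T^\ast \Sigma \otimes \mathfrak{g}$: one for each pair of a spacetime index and a Lie algebra index.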

For readers familiar with the concepts of principal bundles and connections on a bundle we include the following example 13 which generalizes the Yang-Mills field over Minkowski spacetime from example 12 to the situation over general spacetimes.

Example

(general Yang-Mills field in fixed topological sector)

Let $\Sigma$ be any spacetime manifold and let $G$ be a compact Lie group with Lie algebra denoted $\mathfrak{g}$. Let $P \overset{is}{\to} \Sigma$ be a $G$-principal bundle and $\nabla_0$ a chosen connection on it, to be called the background $G$-Yang-Mills field.

Then the field bundle (def. 34) for $G$-Yang-Mills theory in the topological sector $P$ is the tensor product of vector bundles

$$ E \coloneqq \left(P \times^{ad}_G \mathfrak{g}\right) \otimes_\Sigma \left( T^\ast \Sigma \right) $$

of the adjoint bundle of $P$ and the cotangent bundle of $\Sigma$.

With the choice of $\nabla_0$, every (other) connection $\nabla$ on $P$ uniquely decomposes as

$$ \nabla = \nabla_0 + A \,, $$

where

$$ A \in \Gamma_\Sigma(E) $$

is a section of the above field bundle, hence a Yang-Mills field.

The electromagnetic field (example 11) and the Yang-Mills field (example 12, example 13) with differential 1-forms as field histories are the basic examples of gauge fields (we consider this in more detail below in Gauge symmetries). There are also higher gauge fields with differential n-forms as field histories:

Example

(field bundle for B-field)

On Minkowski spacetime $\Sigma$ (def. 23), let the field bundle (def. 34) be given by the skew-symmetrized tensor product of vector bundles of the cotangent bundle with itself

$$ E \coloneqq \wedge^2_\Sigma T^\ast \Sigma \,. $$

This is a trivial vector bundle (example 9) with canonical field coordinates $(b_{\mu \nu})$ subject to

$$ b_{\mu \nu} \;=\; - b_{\nu \mu} \,. $$

A section of this bundle, hence a field history, is a differential 2-form (def. 11)

$$ B \in \Gamma_\Sigma(\wedge^2_\Sigma T^\ast \Sigma) = \Omega^2(\Sigma) $$

on spacetime.

\,

space of field histories

Given any field bundle, we will eventually need to regard the set of all field histories $\Gamma_\Sigma(E)$ as a “smooth set” itself, a smooth space of sections, to which constructions of differential geometry apply (such as for the discussion of observables and states below). Notably we need to be able to talk about differential forms on $\Gamma_\Sigma(E)$.

However, a space of sections $\Gamma_\Sigma(E)$ does not in general carry the structure of a smooth manifold; and it carries the correct smooth structure of an infinite dimensional manifold only if $\Sigma$ is a compact space (see at manifold structure of mapping spaces). Even if it does carry infinite dimensional manifold structure, inspection shows that this is more structure than is actually needed for the discussion of field theory. Namely it turns out below that all we need to know is what counts as a smooth family of sections/field histories, hence which functions of sets

$$ \Phi_{(-)} \;\colon\; \mathbb{R}^n \longrightarrow \Gamma_\Sigma(E) $$

from any Cartesian space $\mathbb{R}^n$ (def. 1) into $\Gamma_\Sigma(E)$ count as smooth functions, subject to some basic consistency condition on this choice.

This structure on $\Gamma_\Sigma(E)$ is called the structure of a diffeological space:

Definition

(diffeological space)

A diffeological space $X$ is

  1. a set $X_s \in Set$;

  2. for each $n \in \mathbb{N}$ a choice of subset

    $$ X(\mathbb{R}^n) \subset Hom_{Set}(\mathbb{R}^n_s, X_s) = \left\{ \mathbb{R}^n_s \to X_s \right\} $$

    of the set of functions from the underlying set $\mathbb{R}^n_s$ of $\mathbb{R}^n$ to $X_s$, to be called the smooth functions or plots from $\mathbb{R}^n$ to $X$;

  3. for each smooth function $f \;\colon\; \mathbb{R}^{n_1} \longrightarrow \mathbb{R}^{n_2}$ between Cartesian spaces (def. 1) a choice of function

    $$ f^\ast \;\colon\; X(\mathbb{R}^{n_2}) \longrightarrow X(\mathbb{R}^{n_1}) $$

    to be thought of as the precomposition operation

    $$ \left( \mathbb{R}^{n_2} \overset{\Phi}{\longrightarrow} X \right) \;\overset{f^\ast}{\mapsto}\; \left( \mathbb{R}^{n_1} \overset{f}{\to} \mathbb{R}^{n_2} \overset{\Phi}{\to} X \right) $$

such that

  1. (constant functions are smooth)

    $$ X(\mathbb{R}^0) = X_s \,, $$
  2. (functoriality)

    1. If $id_{\mathbb{R}^n} \;\colon\; \mathbb{R}^n \to \mathbb{R}^n$ is the identity function on $\mathbb{R}^n$, then $\left(id_{\mathbb{R}^n}\right)^\ast \;\colon\; X(\mathbb{R}^n) \to X(\mathbb{R}^n)$ is the identity function on the set of plots $X(\mathbb{R}^n)$;

    2. If $\mathbb{R}^{n_1} \overset{f}{\to} \mathbb{R}^{n_2} \overset{g}{\to} \mathbb{R}^{n_3}$ are two composable smooth functions between Cartesian spaces (def. 1), then pullback of plots along them consecutively equals the pullback along the composition:

      $$ f^\ast \circ g^\ast = (g \circ f)^\ast $$

      i.e.

      $$ \array{ && X(\mathbb{R}^{n_2}) \\ & {}^{\mathllap{f^\ast}}\swarrow && \nwarrow^{\mathrlap{g^\ast}} \\ X(\mathbb{R}^{n_1}) && \underset{ (g \circ f)^\ast }{\longleftarrow} && X(\mathbb{R}^{n_3}) } $$
  3. (gluing)

    If $\{ U_i \overset{f_i}{\to} \mathbb{R}^n\}_{i \in I}$ is a differentiably good open cover of a Cartesian space (def. 3) then the function which restricts $\mathbb{R}^n$-plots of $X$ to a set of $U_i$-plots

    $$ X(\mathbb{R}^n) \overset{( (f_i)^\ast )_{i \in I} }{\hookrightarrow} \underset{i \in I}{\prod} X(U_i) $$

    is a bijection onto the set of those tuples $(\Phi_i \in X(U_i))_{i \in I}$ of plots which are “matching families” in that they agree on intersections:

    $$ \Phi_i\vert_{U_i \cap U_j} = \Phi_j \vert_{U_i \cap U_j} \phantom{AAAAAA} \array{ && U_i \cap U_j \\ & \swarrow && \searrow \\ U_i && && U_j \\ & {}_{\mathrlap{\Phi_i}}\searrow && \swarrow_{\mathrlap{\Phi_j}} \\ && X } $$

Finally, given $X_1$ and $X_2$ two diffeological spaces, then a smooth function between them

$$ f \;\colon\; X_1 \longrightarrow X_2 $$

is

  • a function of the underlying sets

    $$ f_s \;\colon\; (X_1)_s \longrightarrow (X_2)_s $$

such that

  • for $\Phi \in X_1(\mathbb{R}^n)$ a plot of $X_1$, then the composition $f_s \circ \Phi_s$ is a plot $f_\ast(\Phi)$ of $X_2$:

    $$ \array{ && \mathbb{R}^n \\ & {}^{\mathllap{\Phi}}\swarrow && \searrow^{\mathrlap{f_\ast(\Phi)}} \\ X_1 && \underset{f}{\longrightarrow} && X_2 } \,. $$

(Stated more abstractly, this says simply that diffeological spaces are the concrete sheaves on the site of Cartesian spaces from def. 3.)

For more background on diffeological spaces see also geometry of physics -- smooth sets.

Example

(Cartesian spaces are diffeological spaces)

Let $X$ be a Cartesian space (def. 1). Then it becomes a diffeological space (def. 35) by declaring its plots $\Phi \in X(\mathbb{R}^n)$ to be the ordinary smooth functions $\Phi \colon \mathbb{R}^n \to X$.

Under this identification, a function $f \;\colon\; (X_1)_s \to (X_2)_s$ between the underlying sets of two Cartesian spaces is a smooth function in the ordinary sense precisely if it is a smooth function in the sense of diffeological spaces.

Stated more abstractly, this statement is an example of the Yoneda embedding over a subcanonical site.

More generally, the same construction makes every smooth manifold a diffeological space.

Example

(diffeological space of field histories)

Let $E \overset{fb}{\to} \Sigma$ be a smooth field bundle (def. 34). Then the set $\Gamma_\Sigma(E)$ of field histories/sections (def. 34) becomes a diffeological space (def. 35)

(22) $$ \Gamma_\Sigma(E) \in DiffeologicalSpaces $$

by declaring that a smooth family $\Phi_{(-)}$ of field histories, parameterized over any Cartesian space $U$, is a smooth function out of the Cartesian product manifold of $\Sigma$ with $U$

$$ \array{ U \times \Sigma &\overset{\Phi_{(-)}(-)}{\longrightarrow}& E \\ (u,x) &\mapsto& \Phi_u(x) } $$

such that for each $u \in U$ we have $fb \circ \Phi_{u}(-) = id_\Sigma$, i.e.

$$ \array{ && E \\ & {}^{\mathllap{\Phi_{(-)}(-)}}\nearrow & \downarrow^{\mathrlap{fb}} \\ U \times \Sigma &\underset{pr_2}{\longrightarrow}& \Sigma } \,. $$
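
To make this definition concrete, here is a minimal sketch for the real scalar field, where $E = \Sigma \times \mathbb{R}$. The fixed function $\varphi \in C^\infty(\Sigma)$ and the reparameterization $f$ are hypothetical choices made only for the illustration.

```latex
% Illustrative plot of \Gamma_\Sigma(\Sigma \times \mathbb{R}) over U = \mathbb{R}^1, for a fixed \varphi \in C^\infty(\Sigma).
\begin{aligned}
  \Phi_{(-)}(-) \;\colon\; \mathbb{R}^1 \times \Sigma & \longrightarrow \Sigma \times \mathbb{R}
  \\
  (u, x) & \mapsto \left( x ,\, u \cdot \varphi(x) \right)
  \\
  fb \circ \Phi_u (x) & = pr_1 \left( x ,\, u \cdot \varphi(x) \right) = x
  \\
  (f^\ast \Phi)_v(x) & = \Phi_{f(v)}(x) = \left( x ,\, f(v) \cdot \varphi(x) \right)
  \qquad \text{for smooth } f \colon \mathbb{R}^{n} \to \mathbb{R}^1
\end{aligned}
```

The family is smooth jointly in $(u,x)$ and each member is a section, so it is a plot; pulling back along $f$ just reparameterizes the family, and a plot over $\mathbb{R}^0$ is a single field history.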

The following example 17 is included only for readers who wonder how infinite-dimensional manifolds fit in. Since we will never actually use infinite-dimensional manifold structure, this example may be ignored.

Example

(Fréchet manifolds are diffeological spaces)

Consider the particular type of infinite-dimensional manifolds called Fréchet manifolds. Since ordinary smooth manifolds $U$ are an example, for $X$ a Fréchet manifold there is a concept of smooth functions $U \to X$. Hence we may give $X$ the structure of a diffeological space (def. 35) by declaring the plots over $U$ to be these smooth functions $U \to X$, with the evident precomposition action.

It turns out then that for $X$ and $Y$ two Fréchet manifolds, there is a natural bijection between the smooth functions $X \to Y$ between them regarded as Fréchet manifolds and the smooth functions between them regarded as diffeological spaces. Hence it does not matter which of the two perspectives we take (unless of course a diffeological space more general than a Fréchet manifold enters the picture, at which point the second definition generalizes, whereas the first does not).

Stated more abstractly, this means that Fréchet manifolds form a full subcategory of that of diffeological spaces (this prop.):

$$ FrechetManifolds \hookrightarrow DiffeologicalSpaces \,. $$

If $\Sigma$ is a compact smooth manifold and $E \simeq \Sigma \times F \to \Sigma$ is a trivial fiber bundle with fiber $F$ a smooth manifold, then the set of sections $\Gamma_\Sigma(E)$ carries a standard structure of a Fréchet manifold (see at manifold structure of mapping spaces). Under the above inclusion of Fréchet manifolds into diffeological spaces, this smooth structure agrees with that from example 16 (see this prop.).

Once the step from smooth manifolds to diffeological spaces (def. 35) is made, characterizing the smooth structure of the space entirely by how we may probe it by mapping smooth Cartesian spaces into it, it becomes clear that the underlying set $X_s$ of a diffeological space $X$ is not actually crucial to support the concept: The space is already entirely defined structurally by the system of smooth plots it has, and the underlying set $X_s$ is recovered from these as the set of plots from the point $\mathbb{R}^0$.

This is crucial for field theory: the spaces of field histories of fermionic fields (def. 50 below) such as the Dirac field (example 53 below) do not have underlying sets of points the way diffeological spaces have. Informally, the reason is that a point is a bosonic object, while the nature of fermionic fields is the opposite of bosonic.

But we may just as well drop the mention of the underlying set $X_s$ in the definition of generalized smooth spaces. By simply stripping this requirement off of def. 35 we obtain the following more general and more useful definition (still “bosonic”, though; the supergeometric version is def. 48 below):

Definition

(smooth set)

A smooth set $X$ is

  1. for each $n \in \mathbb{N}$ a choice of set

    $$ X(\mathbb{R}^n) \in Set $$

    to be called the set of smooth functions or plots from $\mathbb{R}^n$ to $X$;

  2. for each smooth function $f \;\colon\; \mathbb{R}^{n_1} \longrightarrow \mathbb{R}^{n_2}$ between Cartesian spaces a choice of function

    $$ f^\ast \;\colon\; X(\mathbb{R}^{n_2}) \longrightarrow X(\mathbb{R}^{n_1}) $$

    to be thought of as the precomposition operation

    $$ \left( \mathbb{R}^{n_2} \overset{\Phi}{\longrightarrow} X \right) \;\overset{f^\ast}{\mapsto}\; \left( \mathbb{R}^{n_1} \overset{f}{\to} \mathbb{R}^{n_2} \overset{\Phi}{\to} X \right) $$

such that

  1. (functoriality)

    1. If $id_{\mathbb{R}^n} \;\colon\; \mathbb{R}^n \to \mathbb{R}^n$ is the identity function on $\mathbb{R}^n$, then $\left(id_{\mathbb{R}^n}\right)^\ast \;\colon\; X(\mathbb{R}^n) \to X(\mathbb{R}^n)$ is the identity function on the set of plots $X(\mathbb{R}^n)$.

    2. If $\mathbb{R}^{n_1} \overset{f}{\to} \mathbb{R}^{n_2} \overset{g}{\to} \mathbb{R}^{n_3}$ are two composable smooth functions between Cartesian spaces, then consecutive pullback of plots along them equals the pullback along the composition:

      $$ f^\ast \circ g^\ast = (g \circ f)^\ast $$

      i.e.

      $$ \array{ && X(\mathbb{R}^{n_2}) \\ & {}^{\mathllap{f^\ast}}\swarrow && \nwarrow^{\mathrlap{g^\ast}} \\ X(\mathbb{R}^{n_1}) && \underset{ (g \circ f)^\ast }{\longleftarrow} && X(\mathbb{R}^{n_3}) } $$
  2. (gluing)

    If $\{ U_i \overset{f_i}{\to} \mathbb{R}^n\}_{i \in I}$ is a differentiably good open cover of a Cartesian space (def. 3) then the function which restricts $\mathbb{R}^n$-plots of $X$ to a set of $U_i$-plots

    $$ X(\mathbb{R}^n) \overset{( (f_i)^\ast )_{i \in I} }{\hookrightarrow} \underset{i \in I}{\prod} X(U_i) $$

    is a bijection onto the set of those tuples $(\Phi_i \in X(U_i))_{i \in I}$ of plots which are “matching families” in that they agree on intersections:

    $$ \Phi_i\vert_{U_i \cap U_j} = \Phi_j \vert_{U_i \cap U_j} \phantom{AAAA} \text{i.e.} \phantom{AAAA} \array{ && U_i \cap U_j \\ & \swarrow && \searrow \\ U_i && && U_j \\ & {}_{\mathrlap{\Phi_i}}\searrow && \swarrow_{\mathrlap{\Phi_j}} \\ && X } $$

Finally, given $X_1$ and $X_2$ two smooth sets, then a smooth function between them

$$ f \;\colon\; X_1 \longrightarrow X_2 $$

is

  • for each $n \in \mathbb{N}$ a function

    $$ f_\ast(\mathbb{R}^n) \;\colon\; X_1(\mathbb{R}^n) \longrightarrow X_2(\mathbb{R}^n) $$

such that

  • for each smooth function $g \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ between Cartesian spaces, with $g_1^\ast$ and $g_2^\ast$ denoting the pullback of plots of $X_1$ and of $X_2$ along $g$, respectively, we have

    $$ g^\ast_2 \circ f_\ast(\mathbb{R}^{n_2}) = f_\ast(\mathbb{R}^{n_1}) \circ g^\ast_1 \phantom{AAAAA} \text{i.e.} \phantom{AAAAA} \array{ X_1(\mathbb{R}^{n_2}) &\overset{f_\ast(\mathbb{R}^{n_2})}{\longrightarrow}& X_2(\mathbb{R}^{n_2}) \\ \mathllap{g_1^\ast}\downarrow && \downarrow\mathrlap{g^\ast_2} \\ X_1(\mathbb{R}^{n_1}) &\underset{f_\ast(\mathbb{R}^{n_1})}{\longrightarrow}& X_2(\mathbb{R}^{n_1}) } $$

Stated more abstractly, this simply says that smooth sets are the sheaves on the site of Cartesian spaces from def. 3.

Basing differential geometry on smooth sets is an instance of the general approach to geometry called functorial geometry or topos theory. For more background on this see at geometry of physics -- smooth sets.

First we verify that the concept of smooth sets is a consistent generalization:

Example

(diffeological spaces are smooth sets)

Every diffeological space $X$ (def. 35) is a smooth set (def. 36) simply by forgetting its underlying set of points and remembering only its sets of plots.

In particular therefore each Cartesian space $\mathbb{R}^n$ is canonically a smooth set by example 15.

Moreover, given any two diffeological spaces, then the morphisms $f \colon X \to Y$ between them, regarded as diffeological spaces, are the same as the morphisms as smooth sets.

Stated more abstractly, this means that we have full subcategory inclusions

$$ CartesianSpaces \overset{\phantom{AAA}}{\hookrightarrow} DiffeologicalSpaces \overset{\phantom{AAA}}{\hookrightarrow} SmoothSets \,. $$

Recall, for the next proposition 17, that in the definition 36 of a smooth set $X$ the sets $X(\mathbb{R}^n)$ are abstract sets which are to be thought of as would-be smooth functions “$\mathbb{R}^n \to X$”. Inside def. 36 this only makes sense in quotation marks, since inside that definition the smooth set $X$ is only being defined, so that inside that definition there is not yet an actual concept of smooth functions of the form “$\mathbb{R}^n \to X$”.

But now that the definition of smooth sets and of morphisms between them has been stated, and seeing that Cartesian spaces $\mathbb{R}^n$ are examples of smooth sets, by example 18, there is now an actual concept of smooth functions $\mathbb{R}^n \to X$, namely as smooth sets. For the concept of smooth sets to be consistent, it ought to be true that this a posteriori concept of smooth functions from Cartesian spaces to smooth sets coincides with the a priori concept, hence that we “may remove the quotation marks” in the above. The following proposition says that this is indeed the case:

Proposition

(plots of a smooth set really are the smooth functions into the smooth set)

Let $X$ be a smooth set (def. 36). For $n \in \mathbb{N}$, there is a natural function

$$ Hom_{SmoothSet}(\mathbb{R}^n , X) \overset{\phantom{AA}\simeq\phantom{AA}}{\longrightarrow} X(\mathbb{R}^n) $$

from the set of homomorphisms of smooth sets from $\mathbb{R}^n$ (regarded as a smooth set via example 18) to $X$, to the set of plots of $X$ over $\mathbb{R}^n$, given by evaluating on the identity plot $id_{\mathbb{R}^n}$.

This function is a bijection.

This says that the plots of $X$, which initially bootstrap $X$ into being by declaring the would-be smooth functions into $X$, end up being the actual smooth functions into $X$.

Proof

This elementary but profound fact is called the Yoneda lemma, here in its incarnation over the site of Cartesian spaces (def. 1).
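
For readers who want to see the mechanism, the following sketch writes out the inverse of the evaluation map. The notation $\eta_\Phi$ for the morphism built from a plot $\Phi$ is introduced here only for illustration; the argument uses nothing beyond the functoriality conditions of def. 36 and the compatibility condition on morphisms.

```latex
% Sketch of the inverse to "evaluate on the identity plot".
% Assumed notation (not from the text): \eta_\Phi for the morphism induced by a plot \Phi.
\begin{aligned}
  & \text{Given } \Phi \in X(\mathbb{R}^n) \text{, define for each } k \in \mathbb{N}:
  \\
  & \qquad \eta_\Phi(\mathbb{R}^k) \;\colon\; C^\infty(\mathbb{R}^k, \mathbb{R}^n) \longrightarrow X(\mathbb{R}^k) \,,
    \qquad f \mapsto f^\ast(\Phi) \,.
  \\
  & \text{Evaluating on the identity recovers the plot:}
  \\
  & \qquad \eta_\Phi(\mathbb{R}^n)(id_{\mathbb{R}^n}) = (id_{\mathbb{R}^n})^\ast(\Phi) = \Phi \,.
  \\
  & \text{Conversely, for any morphism } \eta \text{ and any smooth } f \colon \mathbb{R}^k \to \mathbb{R}^n:
  \\
  & \qquad \eta(\mathbb{R}^k)(f)
    = \eta(\mathbb{R}^k)\left( f^\ast(id_{\mathbb{R}^n}) \right)
    = f^\ast\left( \eta(\mathbb{R}^n)(id_{\mathbb{R}^n}) \right) \,.
\end{aligned}
```

So a morphism out of $\mathbb{R}^n$ is completely determined by its value on the identity plot, and every plot arises as such a value; this is the content of the bijection.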

A key class of examples of smooth sets (def. 36) that are not diffeological spaces (def. 35) are universal smooth moduli spaces of differential forms:

Example

(universal smooth moduli spaces of differential forms)

For $k \in \mathbb{N}$ there is a smooth set (def. 36)

$$ \mathbf{\Omega}^k \;\in\; SmoothSet $$

defined as follows:

  1. for $n \in \mathbb{N}$ the set of plots from $\mathbb{R}^n$ to $\mathbf{\Omega}^k$ is the set of smooth differential k-forms on $\mathbb{R}^n$ (def. 11)

    $$ \mathbf{\Omega}^k(\mathbb{R}^n) \;\coloneqq\; \Omega^k(\mathbb{R}^n) $$
  2. for $f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ a smooth function (def. 1) the operation of pullback of plots along $f$ is just the pullback of differential forms $f^\ast$ from prop. 2

    $$ \array{ \mathbb{R}^{n_1} && \Omega^k(\mathbb{R}^{n_1}) \\ \downarrow^{\mathrlap{f}} && \uparrow^{\mathrlap{f^\ast}} \\ \mathbb{R}^{n_2} && \Omega^k(\mathbb{R}^{n_2}) } $$

That this is functorial is just the standard fact (7) from prop. 2.

For $k = 0$ the smooth set $\mathbf{\Omega}^0$ actually is a diffeological space, in fact under the identification of example 18 this is just the real line:

$$ \mathbf{\Omega}^0 \simeq \mathbb{R}^1 \,. $$

But for $k \geq 1$ we have that the set of plots on $\mathbb{R}^0 = \ast$ is a singleton

$$ \mathbf{\Omega}^{k \geq 1}(\mathbb{R}^0) \simeq \{0\} $$

consisting just of the zero differential form. The only diffeological space with this property is $\mathbb{R}^0 = \ast$ itself. But $\mathbf{\Omega}^{k \geq 1}$ is far from being that trivial: even though its would-be underlying set is a single point, for all $n \geq k$ it admits an infinite set of plots. Therefore the smooth sets $\mathbf{\Omega}^k$ for $k \geq 1$ are not diffeological spaces.

That the smooth set $\mathbf{\Omega}^k$ indeed deserves to be addressed as the universal moduli space of differential k-forms follows from prop. 17: The universal moduli space of $k$-forms ought to carry a universal differential $k$-form $\omega_{univ} \in \Omega^k(\mathbf{\Omega}^k)$ such that every differential $k$-form $\omega$ on any Cartesian space $X = \mathbb{R}^n$ arises as the pullback of differential forms of this universal one along some modulating morphism $f_\omega \colon X \to \mathbf{\Omega}^k$:

$$ \array{ \{\omega\} &\overset{(f_\omega)^\ast}{\longleftarrow}& \{\omega_{univ}\} \\ \\ X &\underset{f_\omega}{\longrightarrow}& \mathbf{\Omega}^k } $$

But with prop. 17 this is precisely what the definition of the plots of $\mathbf{\Omega}^k$ says.

Similarly, all the usual operations on differential forms now have their universal archetype on the universal moduli spaces of differential forms.

In particular, for $k \in \mathbb{N}$ there is a canonical morphism of smooth sets of the form

$$ \mathbf{\Omega}^k \overset{\mathbf{d}}{\longrightarrow} \mathbf{\Omega}^{k+1} $$

defined over $\mathbb{R}^n$ by the ordinary de Rham differential (def. 12)

(23)  $$ \Omega^k(\mathbb{R}^n) \overset{d}{\longrightarrow} \Omega^{k+1}(\mathbb{R}^n) \,. $$

That this satisfies the compatibility with precomposition of plots

$$ \array{ \mathbb{R}^{n_1} && \Omega^k(\mathbb{R}^{n_1}) &\overset{d}{\longrightarrow}& \Omega^{k+1}(\mathbb{R}^{n_1}) \\ {}^{\mathllap{f}}\downarrow && \uparrow^{\mathrlap{f^\ast}} && \uparrow^{\mathrlap{f^\ast}} \\ \mathbb{R}^{n_2} && \Omega^k(\mathbb{R}^{n_2}) &\underset{d}{\longrightarrow}& \Omega^{k+1}( \mathbb{R}^{n_2} ) } $$

is just the compatibility of pullback of differential forms with the de Rham differential from prop. 2.
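The compatibility of pullback with the de Rham differential can be checked by hand in the simplest case. The following is a minimal sketch, assuming $k = 0$ and $n_1 = n_2 = 1$, with integer-coefficient polynomials standing in for smooth functions. The helper names (`compose`, `deriv`, `pullback_0form`, `pullback_1form`, `d`) are hypothetical and chosen only for this illustration; a 1-form $a(x)\, d x$ on $\mathbb{R}^1$ is stored as its coefficient polynomial $a$.

```python
# Polynomials are coefficient lists [c0, c1, c2, ...] meaning c0 + c1*x + c2*x^2 + ...

def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def deriv(p):
    """Ordinary derivative of a polynomial."""
    return trim([i * c for i, c in enumerate(p)][1:] or [0])

def compose(p, q):
    """Return the polynomial x -> p(q(x)), via Horner's scheme."""
    out = [0]
    for c in reversed(p):
        out = add(mul(out, q), [c])
    return out

def pullback_0form(f, g):
    """f^* g = g o f for a 0-form (a function) g."""
    return compose(g, f)

def pullback_1form(f, a):
    """f^*(a dx) = (a o f) * f' dx, returned as its coefficient polynomial."""
    return mul(compose(a, f), deriv(f))

def d(g):
    """de Rham differential of a 0-form g on R^1: dg = g' dx."""
    return deriv(g)

f = [1, 0, 2]      # f(x) = 1 + 2 x^2
g = [0, 3, 0, 1]   # g(x) = 3 x + x^3

# d(f^* g) agrees with f^*(d g); in this case this is the chain rule.
assert d(pullback_0form(f, g)) == pullback_1form(f, d(g))
```

In this toy case the commuting square reduces to the chain rule; the general statement for $k$-forms on $\mathbb{R}^n$ is the one cited from prop. 2.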

The upshot is that we now have a good definition of differential forms on any diffeological space and more generally on any smooth set:

Definition

(differential forms on smooth sets)

Let $X$ be a diffeological space (def. 35) or more generally a smooth set (def. 36) then a differential k-form $\omega$ on $X$ is equivalently a morphism of smooth sets

$$ X \longrightarrow \mathbf{\Omega}^k $$

from $X$ to the universal smooth moduli space of differential forms from example 19.

Concretely, by unwinding the definitions of $\mathbf{\Omega}^k$ and of morphisms of smooth sets, this means that such a differential form is:

  • for each $n \in \mathbb{N}$ and each plot $\mathbb{R}^n \overset{\Phi}{\to} X$ an ordinary differential form

    $$ \Phi^\ast(\omega) \in \Omega^\bullet(\mathbb{R}^n) $$

such that

  • for each smooth function $f \;\colon\; \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ between Cartesian spaces the ordinary pullback of differential forms along $f$ is compatible with these choices, in that for every plot $\mathbb{R}^{n_2} \overset{\Phi}{\to} X$ we have

    $$ f^\ast\left(\Phi^\ast(\omega)\right) = ( f^\ast \Phi )^\ast(\omega) $$

    i.e.

    $$ \array{ \mathbb{R}^{n_1} && \overset{f}{\longrightarrow} && \mathbb{R}^{n_2} \\ & {}_{\mathllap{f^\ast \Phi}}\searrow && \swarrow_{\mathrlap{\Phi}} \\ && X } \phantom{AAAA} \array{ \Omega^\bullet( \mathbb{R}^{n_1} ) && \overset{f^\ast}{\longleftarrow} && \Omega^\bullet(\mathbb{R}^{n_2}) \\ & {}_{\mathllap{(f^\ast \Phi)^\ast}}\nwarrow && \nearrow_{\mathrlap{\Phi^\ast}} \\ && \Omega^\bullet(X) } \,. $$

We write $\Omega^\bullet(X)$ for the set of differential forms on the smooth set $X$ defined this way.

Moreover, given a differential k-form

$$ X \overset{\omega}{\longrightarrow} \mathbf{\Omega}^k $$

on a smooth set $X$ this way, then its de Rham differential $d \omega \in \Omega^{k+1}(X)$ is given by the composite of morphisms of smooth sets with the universal de Rham differential from (23):

(24)  $$ d \omega \;\colon\; X \overset{\omega}{\longrightarrow} \mathbf{\Omega}^k \overset{d}{\longrightarrow} \mathbf{\Omega}^{k+1} \,. $$

Explicitly this means simply that for $\Phi \colon U \to X$ a plot, then

$$ \Phi^\ast (d\omega) \;=\; d\left( \Phi^\ast \omega\right) \;\in\; \Omega^{k+1}(U) \,. $$

The usual operations on ordinary differential forms directly generalize plot-wise to differential forms on diffeological spaces and more generally on smooth sets:

Definition

(exterior differential and exterior product on smooth sets)

Let $X$ be a diffeological space (def. 35) or more generally a smooth set (def. 36). Then

  1. For $\omega \in \Omega^n(X)$ a differential form on $X$ (def. 37) its exterior differential

    $$ d \omega \in \Omega^{n+1}(X) $$

    is defined on any plot $\mathbb{R}^n \overset{\Phi}{\to} X$ as the ordinary exterior differential of the pullback of $\omega$ along that plot:

    $$ \Phi^\ast(d \omega) \coloneqq d \Phi^\ast(\omega) \,. $$
  2. For $\omega_1 \in \Omega^{n_1}(X)$ and $\omega_2 \in \Omega^{n_2}(X)$ two differential forms on $X$ (def. 37) their exterior product

    $$ \omega_1 \wedge \omega_2 \;\in\; \Omega^{n_1 + n_2}(X) $$

    is the differential form defined on any plot $\mathbb{R}^n \overset{\Phi}{\to} X$ as the ordinary exterior product of the pullbacks of the differential forms $\omega_1$ and $\omega_2$ to this plot:

    $$ \Phi^\ast(\omega_1 \wedge \omega_2) \;\coloneqq\; \Phi^\ast(\omega_1) \wedge \Phi^\ast(\omega_2) \,. $$

\,

Infinitesimal geometry

It is crucial in field theory that we consider field histories not only over all of spacetime, but also restricted to submanifolds of spacetime. Or rather, what is actually of interest are the restrictions of the field histories to the infinitesimal neighbourhoods (example 27 below) of these submanifolds. This appears notably in the construction of phase spaces below. Moreover, fermion fields such as the Dirac field (example 35 below) take values in graded infinitesimal spaces, called super spaces (discussed below). Therefore “infinitesimal geometry”, sometimes called formal geometry (as in “formal scheme”) or synthetic differential geometry or synthetic differential supergeometry, is a central aspect of field theory.

In order to mathematically grasp what infinitesimal neighbourhoods are, we appeal to the first magic algebraic property of differential geometry from prop. 1, which says that we may recognize smooth manifolds $X$ dually in terms of their commutative algebras $C^\infty(X)$ of smooth functions on them

$$ C^\infty(-) \;\colon\; SmoothManifolds \overset{\phantom{AAA}}{\hookrightarrow} (\mathbb{R} Algebras)^{op} \,. $$

But since there are of course more algebras $A \in \mathbb{R}Algebras$ than arise this way from smooth manifolds, we may turn this around and try to regard any algebra $A$ as defining a would-be space, which would have $A$ as its algebra of functions.

For example an infinitesimally thickened point should be a space which is “so small” that every smooth function $f$ on it which vanishes at the origin takes values so tiny that some finite power of them is not just even more tiny, but actually vanishes:

Definition

(infinitesimally thickened Cartesian space)

An infinitesimally thickened point

$$ \mathbb{D} \coloneqq Spec(A) $$

is represented by a commutative algebra $A \in \mathbb{R}Alg$ which as a real vector space is a direct sum

$$ A \simeq_{\mathbb{R}} \langle 1 \rangle \oplus V $$

of the 1-dimensional space $\langle 1 \rangle = \mathbb{R}$ of multiples of 1 with a finite dimensional vector space $V$ that is a nilpotent ideal in that for each element $a \in V$ there exists a natural number $n \in \mathbb{N}$ such that

$$ a^{n+1} = 0 \,. $$

More generally, an infinitesimally thickened Cartesian space

$$ \mathbb{R}^n \times \mathbb{D} \;\coloneqq\; \mathbb{R}^n \times Spec(A) $$

is represented by a commutative algebra

$$ C^\infty(\mathbb{R}^n) \otimes A \;\in\; \mathbb{R} Alg $$

which is the tensor product of algebras of the algebra of smooth functions $C^\infty(\mathbb{R}^n)$ on an actual Cartesian space of some dimension $n$ (example 2), with an algebra of functions $A \simeq_{\mathbb{R}} \langle 1\rangle \oplus V$ of an infinitesimally thickened point, as above.

A smooth function between two infinitesimally thickened Cartesian spaces

$$ \mathbb{R}^{n_1} \times Spec(A_1) \overset{f}{\longrightarrow} \mathbb{R}^{n_2} \times Spec(A_2) $$

is by definition dually an $\mathbb{R}$-algebra homomorphism of the form

$$ C^\infty(\mathbb{R}^{n_1}) \otimes A_1 \overset{f^\ast}{\longleftarrow} C^\infty(\mathbb{R}^{n_2}) \otimes A_2 \,. $$

Example

(infinitesimal neighbourhoods in the real line)

Consider the quotient algebra of the formal power series algebra $\mathbb{R}[ [\epsilon] ]$ in a single parameter $\epsilon$ by the ideal generated by $\epsilon^2$:

$$ (\mathbb{R}[ [\epsilon] ])/(\epsilon^2) \;\simeq_{\mathbb{R}}\; \mathbb{R} \oplus \epsilon \mathbb{R} \,. $$

(This is sometimes called the algebra of dual numbers, for no good reason.) The underlying real vector space of this algebra is, as shown, the direct sum of the multiples of 1 with the multiples of $\epsilon$. A general element in this algebra is of the form

$$ a + b \epsilon \in (\mathbb{R}[ [\epsilon] ])/(\epsilon^2) $$

where $a,b \in \mathbb{R}$ are real numbers. The product in this algebra is given by “multiplying out” as usual, and discarding all terms proportional to $\epsilon^2$:

$$ \left( a_1 + b_1 \epsilon \right) \cdot \left( a_2 + b_2 \epsilon \right) \;=\; a_1 a_2 + ( a_1 b_2 + b_1 a_2 ) \epsilon \,. $$
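The product rule of this algebra is easy to experiment with. The following is a minimal sketch, assuming an element $a + b \epsilon$ is stored as a pair of coefficients; the class name `Dual` and the variable `eps` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """An element a + b*eps of R[[eps]]/(eps^2)."""
    a: float  # coefficient of 1
    b: float  # coefficient of eps

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # Multiply out and discard the eps^2 term.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

eps = Dual(0, 1)

# eps is non-zero, but its square vanishes.
assert eps != Dual(0, 0)
assert eps * eps == Dual(0, 0)

# (2 + 3 eps)(5 + 7 eps) = 10 + (2*7 + 3*5) eps
assert Dual(2, 3) * Dual(5, 7) == Dual(10, 29)
```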

We may think of an element $a + b \epsilon$ as the truncation to first order of a Taylor series at the origin of a smooth function on the real line

$$ f \;\colon\; \mathbb{R} \to \mathbb{R} $$

where $a = f(0)$ is the value of the function at the origin, and where $b = \frac{\partial f}{\partial x}(0)$ is its first derivative at the origin.

Therefore this algebra behaves like the algebra of smooth functions on an infinitesimal neighbourhood $\mathbb{D}^1$ of $0 \in \mathbb{R}$ which is so tiny that its elements $\epsilon \in \mathbb{D}^1 \hookrightarrow \mathbb{R}$ become, upon squaring them, not just tinier, but actually zero:

$$ \epsilon^2 = 0 \,. $$

This intuitive picture is now made precise by the concept of infinitesimally thickened points def. 39, if we simply set

$$ \mathbb{D}^1 \;\coloneqq\; Spec\left( \mathbb{R}[ [\epsilon] ]/(\epsilon^2) \right) $$

and observe that there is the inclusion of infinitesimally thickened Cartesian spaces

$$ \mathbb{D}^1 \overset{\phantom{AA}i\phantom{AA} }{\hookrightarrow} \mathbb{R}^1 $$

which is dually given by the algebra homomorphism

$$ \array{ \mathbb{R} \oplus \epsilon \mathbb{R} &\overset{i^\ast}{\longleftarrow}& C^\infty(\mathbb{R}^1) \\ f(0) + \frac{\partial f}{\partial x}(0) \epsilon &\longleftarrow& \{f\} } $$

which sends a smooth function to its value $f(0)$ at zero plus $\epsilon$ times its derivative at zero. Observe that this is indeed a homomorphism of algebras due to the product law of differentiation, which says that

$$ \begin{aligned} i^\ast(f \cdot g) & = (f \cdot g)(0) + \frac{\partial (f \cdot g)}{\partial x}(0) \epsilon \\ & = f(0) \cdot g(0) + \left( \frac{\partial f}{\partial x}(0) \cdot g(0) + f(0) \cdot \frac{\partial g}{\partial x}(0) \right) \epsilon \\ & = \left( f(0) + \frac{\partial f}{\partial x}(0) \epsilon \right) \cdot \left( g(0) + \frac{\partial g}{\partial x}(0) \epsilon \right) \end{aligned} $$

where the last step uses $\epsilon^2 = 0$.
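The homomorphism property of $i^\ast$ can be spot-checked numerically. The following is a minimal sketch, assuming polynomials (given as coefficient lists) stand in for smooth functions on $\mathbb{R}^1$, so that $f(0)$ and $\frac{\partial f}{\partial x}(0)$ are just the first two coefficients. The function names `poly_mul`, `restrict_to_D1` and `dual_mul` are hypothetical.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def restrict_to_D1(p):
    """i^*: send f to the pair (f(0), f'(0)), read as f(0) + f'(0)*eps."""
    return (p[0], p[1] if len(p) > 1 else 0)

def dual_mul(u, v):
    """Product in R[[eps]]/(eps^2) on pairs (a, b) meaning a + b*eps."""
    return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])

f = [1, 2, 3]   # f(x) = 1 + 2x + 3x^2
g = [4, 5, 6]   # g(x) = 4 + 5x + 6x^2

# Restricting the product equals multiplying the restrictions.
assert restrict_to_D1(poly_mul(f, g)) == dual_mul(restrict_to_D1(f), restrict_to_D1(g))
```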

Hence we see that restricting a smooth function to the infinitesimal neighbourhood of a point is equivalent to restricting attention to its Taylor series to the given order at that point:

$$ \array{ \mathbb{D}^1 &\overset{i}{\hookrightarrow}& \mathbb{R}^1 \\ & {}_{\mathllap{(\epsilon \mapsto f(0) + \frac{\partial f}{\partial x}(0) \epsilon) }}\searrow & \downarrow_{\mathrlap{f}} \\ && \mathbb{R}^1 } $$

Similarly for each $k \in \mathbb{N}$ the algebra

$$ (\mathbb{R}[ [ \epsilon ] ])/(\epsilon^{k+1}) $$

may be thought of as the algebra of Taylor series at the origin of $\mathbb{R}$ of smooth functions $\mathbb{R} \to \mathbb{R}$, where all terms of order higher than $k$ are discarded. The corresponding infinitesimally thickened point is often denoted

$$ \mathbb{D}^1(k) \;\coloneqq\; Spec\left( \left(\mathbb{R}[ [\epsilon] ]\right)/(\epsilon^{k+1}) \right) \,. $$

This is now the subobject of the real line

$$ \mathbb{D}^1(k) \overset{\phantom{AAA}}{\hookrightarrow} \mathbb{R}^1 $$

on those elements $\epsilon$ such that $\epsilon^{k+1} = 0$.

(Kock 81, Kock 10)
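The order-$k$ truncation can be sketched in the same spirit. The following assumes an element of $(\mathbb{R}[ [\epsilon] ])/(\epsilon^{k+1})$ is stored as its list of $k+1$ Taylor coefficients; the function names `trunc_mul` and `power` are hypothetical and the code is only an illustration, not a construction used in the text.

```python
def trunc_mul(p, q, k):
    """Product in R[[eps]]/(eps^(k+1)); p and q are coefficient lists of length k+1."""
    out = [0] * (k + 1)
    for i, a in enumerate(p[: k + 1]):
        for j, b in enumerate(q[: k + 1]):
            if i + j <= k:          # terms of order higher than k are discarded
                out[i + j] += a * b
    return out

def power(p, n, k):
    """n-th power of p in the truncated algebra."""
    result = [1] + [0] * k
    for _ in range(n):
        result = trunc_mul(result, p, k)
    return result

k = 3
eps = [0, 1, 0, 0]

# eps^k is still non-zero, but eps^(k+1) vanishes.
assert power(eps, k, k) == [0, 0, 0, 1]
assert power(eps, k + 1, k) == [0, 0, 0, 0]
```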

The following example 21 shows that infinitesimal thickening is invisible for ordinary spaces when mapping out of these. In contrast example 22 further below shows that the morphisms into an ordinary space out of an infinitesimal space are interesting: these are tangent vectors and their higher order infinitesimal analogs.

Example

(infinitesimal line $\mathbb{D}^1$ has unique global point)

For $\mathbb{R}^n$ any ordinary Cartesian space (def. 1) and $\mathbb{D}^1(k) \hookrightarrow \mathbb{R}^1$ the order-$k$ infinitesimal neighbourhood of the origin in the real line from example 20, there is exactly one possible morphism of infinitesimally thickened Cartesian spaces from $\mathbb{R}^n$ to $\mathbb{D}^1(k)$:

$$ \array{ \mathbb{R}^n && \overset{\exists !}{\longrightarrow} && \mathbb{D}^1(k) \\ & {}_{\mathllap{\exists !}}\searrow && \nearrow_{\mathrlap{\exists !}} \\ && \mathbb{R}^0 = \ast } \,. $$

Proof

By definition such a morphism is dually an algebra homomorphism

$$ C^\infty(\mathbb{R}^n) \overset{f^\ast}{\longleftarrow} \left( \mathbb{R}[ [\epsilon] ] \right)/(\epsilon^{k+1}) \simeq_{\mathbb{R}} \mathbb{R} \oplus \mathcal{O}(\epsilon) $$

from the higher order “algebra of dual numbers” to the algebra of smooth functions (example 2).

Now this being an $\mathbb{R}$-algebra homomorphism, its action on the multiples $c \in \mathbb{R}$ of the identity is fixed:

$$ f^\ast(1) = 1 \,. $$

All the remaining elements are proportional to $\epsilon$, and hence are nilpotent. Since $f^\ast$ is an algebra homomorphism, it must send the nilpotent element $\epsilon$ to a nilpotent element $f^\ast(\epsilon)$, because

$$ \begin{aligned} \left(f^\ast(\epsilon)\right)^{k+1} & = f^\ast\left( \epsilon^{k+1}\right) \\ & = f^\ast(0) \\ & = 0 \end{aligned} $$

But the only nilpotent element in $C^\infty(\mathbb{R}^n)$ is the zero element (a real-valued function some power of which vanishes must itself vanish at every point), and hence it follows that

$$ f^\ast(\epsilon) = 0 \,. $$

Thus $f^\ast$ as above is uniquely fixed.

Example

(synthetic tangent vector fields)

Let $\mathbb{R}^n$ be a Cartesian space (def. 1), regarded as an infinitesimally thickened Cartesian space (def. 39) and consider $\mathbb{D}^1 \coloneqq Spec( (\mathbb{R}[ [\epsilon] ])/(\epsilon^2) )$ the first order infinitesimal line from example 20.

Then homomorphisms of infinitesimally thickened Cartesian spaces of the form

$$ \array{ \mathbb{R}^n \times \mathbb{D}^1 && \overset{\tilde v}{\longrightarrow} && \mathbb{R}^n \\ & {}_{\mathllap{pr_1}}\searrow && \swarrow_{\mathrlap{id}} \\ && \mathbb{R}^n } $$

hence smoothly $\mathbb{R}^n$-parameterized collections of morphisms

$$ \tilde v_x \;\colon\; \mathbb{D}^1 \longrightarrow \mathbb{R}^n $$

which send the unique base point $\Re(\mathbb{D}^1) = \ast$ (example 21) to $x \in \mathbb{R}^n$, are in natural bijection with tangent vector fields $v \in \Gamma_{\mathbb{R}^n}(T \mathbb{R}^n)$ (example 5).

Proof

By definition, the morphisms in question are dually $\mathbb{R}$-algebra homomorphisms of the form

$$ (C^\infty(\mathbb{R}^n) \oplus \epsilon C^\infty(\mathbb{R}^n)) \longleftarrow C^\infty(\mathbb{R}^n) $$

which are the identity modulo $\epsilon$. Such a morphism has to take any function $f \in C^\infty(\mathbb{R}^n)$ to

$$ f + (\partial f) \epsilon $$

for some smooth function $(\partial f) \in C^\infty(\mathbb{R}^n)$. The condition that this assignment makes an algebra homomorphism is equivalent to the statement that for all $f_1,f_2 \in C^\infty(\mathbb{R}^n)$ we have

$$ (f_1 f_2 + (\partial (f_1 f_2))\epsilon ) \;=\; (f_1 + (\partial f_1) \epsilon) \cdot (f_2 + (\partial f_2) \epsilon) \,. $$

Multiplying this out and using that $\epsilon^2 = 0$, this is equivalent to

$$ \partial(f_1 f_2) = (\partial f_1) f_2 + f_1 (\partial f_2) \,. $$

This in turn means equivalently that $\partial\colon C^\infty(\mathbb{R}^n)\to C^\infty(\mathbb{R}^n)$ is a derivation.

With this the statement follows with the third magic algebraic property of smooth functions (prop. 1): derivations of smooth functions are vector fields.
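The equivalence between the homomorphism condition and the Leibniz rule can be tested on a toy model. The following is a minimal sketch, assuming $n = 1$ and polynomials (coefficient lists) in place of smooth functions; an element $f + (\partial f)\epsilon$ is stored as the pair `(f, delta(f))`. All function names are hypothetical. The vector field $x \,\partial_x$ passes the check, while the second derivative, which is not a derivation, fails it.

```python
def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def deriv(p):
    return trim([i * c for i, c in enumerate(p)][1:] or [0])

def pair_mul(u, v):
    """(f1 + g1*eps)(f2 + g2*eps) with eps^2 = 0, on pairs of polynomials."""
    return (mul(u[0], v[0]), add(mul(u[0], v[1]), mul(u[1], v[0])))

def is_homomorphic_on(delta, f1, f2):
    """Does f -> f + delta(f)*eps respect the product of f1 and f2?"""
    phi = lambda f: (trim(f), delta(f))
    return phi(mul(f1, f2)) == pair_mul(phi(f1), phi(f2))

vector_field = lambda f: mul([0, 1], deriv(f))   # the derivation x * d/dx
second_deriv = lambda f: deriv(deriv(f))         # not a derivation

f1 = [1, 1]      # 1 + x
f2 = [0, 0, 1]   # x^2

assert is_homomorphic_on(vector_field, f1, f2)
assert not is_homomorphic_on(second_deriv, f1, f2)
```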

We need to consider infinitesimally thickened spaces more general than the thickenings of just Cartesian spaces in def. 39. But just as Cartesian spaces (def. 1) serve as the local test geometries to induce the general concept of diffeological spaces and smooth sets (def. 36), so using infinitesimally thickened Cartesian spaces as test geometries immediately induces the corresponding generalization of smooth sets with infinitesimals:

Definition

(formal smooth set)

A formal smooth set $X$ is

  1. for each infinitesimally thickened Cartesian space $\mathbb{R}^n \times Spec(A)$ (def. 39) a set

    $$ X(\mathbb{R}^n \times Spec(A)) \in Set $$

    to be called the set of smooth functions or plots from $\mathbb{R}^n \times Spec(A)$ to $X$;

  2. for each smooth function $f \;\colon\; \mathbb{R}^{n_1} \times Spec(A_1) \longrightarrow \mathbb{R}^{n_2} \times Spec(A_2)$ between infinitesimally thickened Cartesian spaces a choice of function

    $$ f^\ast \;\colon\; X(\mathbb{R}^{n_2} \times Spec(A_2)) \longrightarrow X(\mathbb{R}^{n_1} \times Spec(A_1)) $$

    to be thought of as the precomposition operation

    $$ \left( \mathbb{R}^{n_2} \times Spec(A_2) \overset{\Phi}{\longrightarrow} X \right) \;\overset{f^\ast}{\mapsto}\; \left( \mathbb{R}^{n_1}\times Spec(A_1) \overset{f}{\to} \mathbb{R}^{n_2} \times Spec(A_2) \overset{\Phi}{\to} X \right) $$

such that

  1. (functoriality)

    1. If $id_{\mathbb{R}^n \times Spec(A)} \;\colon\; \mathbb{R}^n \times Spec(A) \to \mathbb{R}^n \times Spec(A)$ is the identity function on $\mathbb{R}^n \times Spec(A)$, then $\left(id_{\mathbb{R}^n \times Spec(A)}\right)^\ast \;\colon\; X(\mathbb{R}^n \times Spec(A)) \to X(\mathbb{R}^n \times Spec(A))$ is the identity function on the set of plots $X(\mathbb{R}^n \times Spec(A))$;

    2. If $\mathbb{R}^{n_1}\times Spec(A_1) \overset{f}{\to} \mathbb{R}^{n_2} \times Spec(A_2) \overset{g}{\to} \mathbb{R}^{n_3} \times Spec(A_3)$ are two composable smooth functions between infinitesimally thickened Cartesian spaces, then pullback of plots along them consecutively equals the pullback along the composition:

      $$ f^\ast \circ g^\ast = (g \circ f)^\ast $$

      i.e.

      $$ \array{ && X(\mathbb{R}^{n_2} \times Spec(A_2)) \\ & {}^{\mathllap{f^\ast}}\swarrow && \nwarrow^{\mathrlap{g^\ast}} \\ X(\mathbb{R}^{n_1} \times Spec(A_1)) && \underset{ (g \circ f)^\ast }{\longleftarrow} && X(\mathbb{R}^{n_3} \times Spec(A_3)) } $$
  2. (gluing)

    If $\{ U_i \times Spec(A) \overset{f_i \times id_{Spec(A)}}{\to} \mathbb{R}^n \times Spec(A)\}_{i \in I}$ is such that

    $\{ U_i \overset{f_i }{\to} \mathbb{R}^n \}_{i \in I}$

    is a differentiably good open cover (def. 3) then the function which restricts $\mathbb{R}^n \times Spec(A)$-plots of $X$ to a set of $U_i \times Spec(A)$-plots

    $$ X(\mathbb{R}^n \times Spec(A)) \overset{( (f_i)^\ast )_{i \in I} }{\hookrightarrow} \underset{i \in I}{\prod} X(U_i \times Spec(A)) $$

    is a bijection onto the set of those tuples $(\Phi_i \in X(U_i \times Spec(A)))_{i \in I}$ of plots, which are “matching families” in that they agree on intersections:

    $$ \Phi_i\vert_{(U_i \cap U_j) \times Spec(A)} = \Phi_j \vert_{(U_i \cap U_j)\times Spec(A)} $$

    i.e.

    $$ \array{ && (U_i \cap U_j) \times Spec(A) \\ & \swarrow && \searrow \\ U_i\times Spec(A) && && U_j \times Spec(A) \\ & {}_{\mathrlap{\Phi_i}}\searrow && \swarrow_{\mathrlap{\Phi_j}} \\ && X } $$

Finally, given $X_1$ and $X_2$ two formal smooth sets, then a smooth function between them

$$ f \;\colon\; X_1 \longrightarrow X_2 $$

is

  • for each infinitesimally thickened Cartesian space $\mathbb{R}^n \times Spec(A)$ (def. 39) a function

    $$ f_\ast(\mathbb{R}^n \times Spec(A)) \;\colon\; X_1(\mathbb{R}^n \times Spec(A)) \longrightarrow X_2(\mathbb{R}^n \times Spec(A)) $$

such that

  • for each smooth function $g \colon \mathbb{R}^{n_1} \times Spec(A_1) \to \mathbb{R}^{n_2} \times Spec(A_2)$ between infinitesimally thickened Cartesian spaces we have

    $$ g^\ast_2 \circ f_\ast(\mathbb{R}^{n_2} \times Spec(A_2)) = f_\ast(\mathbb{R}^{n_1} \times Spec(A_1)) \circ g^\ast_1 $$

    i.e.

    $$ \array{ X_1(\mathbb{R}^{n_2} \times Spec(A_2)) &\overset{f_\ast(\mathbb{R}^{n_2}\times Spec(A_2) )}{\longrightarrow}& X_2(\mathbb{R}^{n_2} \times Spec(A_2)) \\ \mathllap{g_1^\ast}\downarrow && \downarrow\mathrlap{g^\ast_2} \\ X_1(\mathbb{R}^{n_1} \times Spec(A_1)) &\underset{f_\ast(\mathbb{R}^{n_1} \times Spec(A_1))}{\longrightarrow}& X_2(\mathbb{R}^{n_1} \times Spec(A_1)) } $$

(Dubuc 79)

Basing infinitesimal geometry on formal smooth sets is an instance of the general approach to geometry called functorial geometry or topos theory. For more background on this see at geometry of physics -- manifolds and orbifolds.

We have the evident generalization of example 15 to smooth geometry with infinitesimals:

Example

(infinitesimally thickened Cartesian spaces are formal smooth sets)

For $X$ an infinitesimally thickened Cartesian space (def. 39), it becomes a formal smooth set according to def. 40 by taking its plots out of some $\mathbb{R}^n \times \mathbb{D}$ to be the homomorphisms of infinitesimally thickened Cartesian spaces:

$$ X(\mathbb{R}^n \times \mathbb{D}) \;\coloneqq\; Hom_{FormalCartSp}( \mathbb{R}^n \times \mathbb{D}, X ) \,. $$

(Stated more abstractly, this is an instance of the Yoneda embedding over a subcanonical site.)

Example

(smooth sets are formal smooth sets)

Let $X$ be a smooth set (def. 36). Then $X$ becomes a formal smooth set (def. 40) by declaring the set of plots $X(\mathbb{R}^n \times \mathbb{D})$ over an infinitesimally thickened Cartesian space (def. 39) to be equivalence classes of pairs

$$ \mathbb{R}^n \times \mathbb{D} \longrightarrow \mathbb{R}^{k} \,, \phantom{AA} \mathbb{R}^k \longrightarrow X $$

of a morphism of infinitesimally thickened Cartesian spaces and of a plot of $X$, as shown, subject to the equivalence relation which identifies two such pairs if there exists a smooth function $f \colon \mathbb{R}^k \to \mathbb{R}^{k'}$ such that

$$ \array{ && \mathbb{R}^n \times \mathbb{D} \\ & \swarrow && \searrow \\ \mathbb{R}^k && \overset{f}{\longrightarrow} && \mathbb{R}^{k'} \\ \mathbb{R}^k && \underset{f}{\longrightarrow} && \mathbb{R}^{k'} \\ & \searrow && \swarrow \\ && X } $$

Stated more abstractly this says that $X$ as a formal smooth set is the left Kan extension (see this example) of $X$ as a smooth set along the functor that includes Cartesian spaces (def. 1) into infinitesimally thickened Cartesian spaces (def. 39).

Definition

(reduction and infinitesimal shape)

For $\mathbb{R}^n \times \mathbb{D}$ an infinitesimally thickened Cartesian space (def. 39) we say that the underlying ordinary Cartesian space $\mathbb{R}^n$ (def. 1) is its reduction

$$ \Re\left( \mathbb{R}^n \times \mathbb{D} \right) \;\coloneqq\; \mathbb{R}^n \,. $$

There is the canonical inclusion morphism

$$ \Re\left( \mathbb{R}^n \times \mathbb{D} \right) = \mathbb{R}^n \overset{\phantom{AAAA}}{\hookrightarrow} \mathbb{R}^n \times \mathbb{D} $$

which dually corresponds to the homomorphism of commutative algebras

$$ C^\infty(\mathbb{R}^n) \longleftarrow C^\infty(\mathbb{R}^n) \otimes_{\mathbb{R}} A $$

which is the identity on all smooth functions $f \in C^\infty(\mathbb{R}^n)$ and is zero on all elements $a \in V \subset A$ in the nilpotent ideal of $A$ (as in example 21).

Given any formal smooth set $X$, we say that its infinitesimal shape or de Rham shape (also: de Rham stack) is the formal smooth set $\Im X$ (def. 40) defined to have as plots the reductions of the plots of $X$, according to the above:

$$ (\Im X)( U ) \;\coloneqq\; X(\Re(U)) \,. $$

There is a canonical morphism of formal smooth sets

$$ \eta_X \;\colon\; X \longrightarrow \Im X $$

which takes a plot

$$ U = \mathbb{R}^n \times \mathbb{D} \overset{f}{\longrightarrow} X $$

to the composition

$$ \mathbb{R}^n \hookrightarrow \mathbb{R}^n \times \mathbb{D} \overset{f}{\longrightarrow} X $$

regarded as a plot of $\Im X$.

Example

(mapping space out of an infinitesimally thickened Cartesian space)

Let $X$ be an infinitesimally thickened Cartesian space (def. 39) and let $Y$ be a formal smooth set (def. 40). Then the mapping space

$$ [X,Y] \;\in\; FormalSmoothSet $$

of smooth functions from $X$ to $Y$ is the formal smooth set whose $U$-plots are the morphisms of formal smooth sets from the Cartesian product of infinitesimally thickened Cartesian spaces $U \times X$ to $Y$, hence the $U \times X$-plots of $Y$:

$$ [X,Y](U) \;\coloneqq\; Y(U \times X) \,. $$

Example

(synthetic tangent bundle)

Let $X \coloneqq \mathbb{R}^n$ be a Cartesian space (def. 1) regarded as an infinitesimally thickened Cartesian space (def. 39) and thus regarded as a formal smooth set (def. 40) by example 23. Consider the infinitesimal line

$$ \mathbb{D}^1 \hookrightarrow \mathbb{R}^1 $$

from example 20. Then the mapping space $[\mathbb{D}^1, X]$ (example 25) is the total space of the tangent bundle $T X$ (example 5). Moreover, under restriction along the reduction $\ast \longrightarrow \mathbb{D}^1$, this is the full tangent bundle projection, in that there is a natural isomorphism of formal smooth sets of the form

$$ \array{ T X &\simeq& [\mathbb{D}^1, X] \\ {}^{\mathllap{tb}}\downarrow && \downarrow^{\mathrlap{ [ \ast \to \mathbb{D}^1, X ] }} \\ X &\simeq& [\ast, X] } $$

In particular this implies immediately that smooth sections (def. 5) of the tangent bundle

$$ \array{ && [\mathbb{D}^1, X] & \simeq T X \\ & {}^{\mathllap{v}}\nearrow & \downarrow \\ X &=& X } $$

are equivalently morphisms of the form

$$ \array{ && X \\ & {}^{\mathllap{\tilde v}}\nearrow & \downarrow^{\mathrlap{id}} \\ X \times \mathbb{D}^1 &\underset{pr_1}{\longrightarrow}& X } $$

which we had already identified with tangent vector fields (def. 5) in example 22.

Proof

This follows by an analogous argument as in example 22, using the Hadamard lemma.
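The identification of maps $\mathbb{D}^1 \to \mathbb{R}^n$ with tangent vectors has a familiar computational shadow: evaluating a function on $x + v \epsilon$ returns its value at $x$ together with its directional derivative along $v$. The following is a minimal sketch, assuming a polynomial function $\mathbb{R}^2 \to \mathbb{R}$ built from sums and products only; the names `Dual`, `pushforward` and the sample function `f` are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    """a + b*eps with eps^2 = 0: a base point a together with a tangent component b."""
    a: float
    b: float

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

def pushforward(func, point, vector):
    """Compose the D^1-plot eps -> point + vector*eps with func.

    Returns (value of func at point, derivative of func at point along vector).
    """
    args = [Dual(p, v) for p, v in zip(point, vector)]
    out = func(*args)
    return out.a, out.b

def f(x, y):
    # Sample function f(x, y) = x*y + x^2, using only + and *.
    return x * y + x * x

# At (2, 3): f = 10, df/dx = y + 2x = 7, df/dy = x = 2.
assert pushforward(f, (2, 3), (1, 0)) == (10, 7)
assert pushforward(f, (2, 3), (0, 1)) == (10, 2)
```

Here a pair (point, vector) plays the role of a single element of $[\mathbb{D}^1, \mathbb{R}^2]$, and forgetting the $\epsilon$-component is the restriction along $\ast \to \mathbb{D}^1$.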

While in any single infinitesimally thickened Cartesian space (def. 39) the infinitesimals are bounded by some fixed finite order, in formal smooth sets (def. 40) we may find infinitesimals of arbitrarily high finite order:

Example

(infinitesimal neighbourhood)

Let $X$ be a formal smooth set (def. 40) and $Y \hookrightarrow X$ a sub-formal smooth set. Then the infinitesimal neighbourhood to arbitrary infinitesimal order of $Y$ in $X$ is the formal smooth set $N_X Y$ whose plots are those plots of $X$

$$ \mathbb{R}^n \times Spec(A) \overset{f}{\longrightarrow} X $$

such that their reduction (def. 41)

$$ \mathbb{R}^n \hookrightarrow \mathbb{R}^n \times Spec(A) \overset{f}{\longrightarrow} X $$

factors through a plot of $Y$.

This allows us to grasp the restriction of field histories to the infinitesimal neighbourhood of a submanifold of spacetime, which will be crucial for the discussion of phase spaces below.

Definition

(field histories on infinitesimal neighbourhood of submanifold of spacetime)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34) and let $S \hookrightarrow \Sigma$ be a submanifold of spacetime.

We write $N_\Sigma(S) \hookrightarrow \Sigma$ for its infinitesimal neighbourhood in $\Sigma$ (example 27).

Then the set of field histories restricted to $S$, to be denoted

(25)  $$ \Gamma_{S}(E) \coloneqq \Gamma_{N_\Sigma(S)}( E\vert_{N_\Sigma S} ) \in \mathbf{H} $$

is the set of sections of $E$ restricted to the infinitesimal neighbourhood $N_\Sigma(S)$ (example 27).

\,

We close the discussion of infinitesimal differential geometry by explaining how we may recover the concept of smooth manifolds inside the more general formal smooth sets (def./prop. 44 below). The key point is that the presence of infinitesimals in the theory allows an intrinsic definition of local diffeomorphisms/formally étale morphisms (def. 43 and example 28 below). It is noteworthy that the only role this concept plays in the development of field theory below is that smooth manifolds admit triangulations by smooth singular simplices (def. 14) so that the concept of fiber integration of differential forms is well defined over manifolds.

Definition

(local diffeomorphism of formal smooth sets)

Let $X, Y$ be formal smooth sets (def. 40). Then a morphism between them is called a local diffeomorphism or formally étale morphism, denoted

$$ f \;\colon\; X \overset{et}{\longrightarrow} Y \,, $$

if for each infinitesimally thickened Cartesian space (def. 39) $\mathbb{R}^n \times \mathbb{D}$ we have a natural identification between the $\mathbb{R}^n \times \mathbb{D}$-plots of $X$ and those $\mathbb{R}^n \times \mathbb{D}$-plots of $Y$ whose reduction (def. 41) comes from an $\mathbb{R}^n$-plot of $X$, hence if we have a natural fiber product of sets of plots

$$ X(\mathbb{R}^n \times \mathbb{D}) \;\simeq\; Y(\mathbb{R}^n \times \mathbb{D}) \underset{Y(\mathbb{R}^n)}{\times^f} X(\mathbb{R}^n) $$

i.e.

$$ \array{ && X(\mathbb{R}^n \times \mathbb{D}) \\ & \swarrow && \searrow \\ Y(\mathbb{R}^n \times \mathbb{D}) && \text{(pb)} && X(\mathbb{R}^n) \\ & \searrow && \swarrow \\ && Y(\mathbb{R}^n ) } $$

for all infinitesimally thickened Cartesian spaces $\mathbb{R}^n \times \mathbb{D}$.

Stated more abstractly, this means that the naturality square of the unit of the infinitesimal shape $\Im$ (def. 41) is a pullback square

$$ \array{ X &\overset{\eta_X}{\longrightarrow}& \Im X \\ {}^{\mathllap{f}}\downarrow &\text{(pb)}& \downarrow^{\mathrlap{\Im f}} \\ Y &\underset{\eta_Y}{\longrightarrow}& \Im Y } $$

(Khavkine-Schreiber 17, def. 3.1)

Example

(local diffeomorphism between Cartesian spaces from the general definition)

For $X, Y \in CartSp$ two ordinary Cartesian spaces (def. 1), regarded as formal smooth sets by example 23, a morphism $f \colon X \to Y$ between them is a local diffeomorphism in the general sense of def. 43 precisely if it is so in the ordinary sense of def. 2.

(Khavkine-Schreiber 17, prop. 3.2)
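As an informal illustration of how def. 43 reduces to the ordinary condition, here is a sketch for $X = Y = \mathbb{R}$. It tests only the first-order infinitesimal interval $\mathbb{D} = Spec(\mathbb{R}[\epsilon]/(\epsilon^2))$, which is our choice for illustration. The names $x_0, x_1, y_0, y_1$ are notation introduced here, not taken from the cited reference.

```latex
% Plots of \mathbb{R} over \mathbb{R}^n \times \mathbb{D},
% with \mathbb{D} = Spec(\mathbb{R}[\epsilon]/(\epsilon^2)):
X(\mathbb{R}^n \times \mathbb{D})
  \;=\; \{\, x_0 + \epsilon\, x_1 \;\vert\; x_0, x_1 \in C^\infty(\mathbb{R}^n) \,\}

% Postcomposition with f \colon \mathbb{R} \to \mathbb{R} (first-order Taylor expansion):
f(x_0 + \epsilon\, x_1) \;=\; f(x_0) + \epsilon\, f'(x_0)\, x_1

% The fiber product on the right-hand side of def. 43:
Y(\mathbb{R}^n \times \mathbb{D}) \underset{Y(\mathbb{R}^n)}{\times^f} X(\mathbb{R}^n)
  \;=\; \{\, (y_0 + \epsilon\, y_1 \,,\; x_0) \;\vert\; y_0 = f(x_0) \,\}

% The comparison map (x_0, x_1) \mapsto ( f(x_0) + \epsilon\, f'(x_0)\, x_1 \,,\; x_0 )
% is a bijection for all x_0 precisely if x_1 \mapsto f'(x_0)\, x_1 is invertible, i.e. if
f'(x) \neq 0 \quad \text{for all } x \in \mathbb{R}
```

By the inverse function theorem, a nowhere vanishing derivative is the usual criterion for $f$ to be a local diffeomorphism of the real line, which is the content one would expect from the ordinary notion.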

Definition/Proposition

(smooth manifolds)

A smooth manifold $X$ of dimension $n \in \mathbb{N}$ is

such that

  1. there exists an indexed set $\{ \mathbb{R}^n \overset{\phi_i}{\to} X\}_{i \in I}$ of morphisms of formal smooth sets (def. 40) from Cartesian spaces $\mathbb{R}^n$ (def. 1) (regarded as formal smooth sets via example 15, example 18 and example 24) into $X$ (regarded as a formal smooth set via example 18 and example 24) such that

    1. every point $x \in X_s$ is in the image of at least one of the $\phi_i$;

    2. every $\phi_i$ is a local diffeomorphism according to def. 43;

  2. the final topology induced by the set of plots of $X$ makes $X_s$ a paracompact Hausdorff space.

(Khavkine-Schreiber 17, example 3.4)

For more on smooth manifolds from the perspective of formal smooth sets see at geometry of physics -- manifolds and orbifolds.

\,

fermion fields and supergeometry

Field theories of interest crucially involve fermionic fields (def. 50 below), such as the Dirac field (example 35 below), which are subject to the “Pauli exclusion principle”, a key reason for the stability of matter. Mathematically this principle means that these fields have field bundles whose field fiber is not an ordinary manifold, but an odd-graded supermanifold (more on this in remark 17 and remark 27 below).

This “supergeometry” is an immediate generalization of the infinitesimal geometry above, where now the infinitesimal elements in the algebra of functions may be equipped with a grading: one speaks of superalgebra.

The “super”-terminology for something as down-to-earth as the mathematical principle behind the stability of matter may seem unfortunate. For better or worse, this terminology has become standard since the middle of the 20th century. But the concept that today is called supercommutative superalgebra was in fact first considered by Grassmann 1844 who got it right (“Ausdehnungslehre”) but apparently was too far ahead of his time and remained unappreciated.

Beware that considering supergeometry does not necessarily involve considering “supersymmetry”. Supergeometry is the geometry of fermion fields (def. 50 below), and that all matter fields in the observable universe are fermionic has been experimentally established since the Stern-Gerlach experiment in 1922. Supersymmetry, on the other hand, is a hypothetical extension of spacetime-symmetry within the context of supergeometry. Here we do not discuss supersymmetry; for details see instead at geometry of physics -- supersymmetry.

Definition

(supercommutative superalgebra)

A real $\mathbb{Z}/2$-graded algebra or superalgebra is an associative algebra $A$ over the real numbers together with a direct sum decomposition of its underlying real vector space

$$ A \simeq_{\mathbb{R}} A_{even} \oplus A_{odd} \,, $$

such that the product in the algebra respects the multiplication in the cyclic group of order 2, $\mathbb{Z}/2 = \{even, odd\}$:

$$ \left. \array{ A_{even} \cdot A_{even} \\ A_{odd} \cdot A_{odd} } \right\} \subset A_{even} \phantom{AAAA} \left. \array{ A_{odd} \cdot A_{even} \\ A_{even} \cdot A_{odd} } \right\} \subset A_{odd} \,. $$

This is called a supercommutative superalgebra if for all elements $a_1, a_2 \in A$ which are of homogeneous degree ${\vert a_i \vert} \in \mathbb{Z}/2 = \{even, odd\}$ in that

$$ a_i \in A_{{\vert a_i\vert}} \subset A $$

we have

$$ a_1 \cdot a_2 = (-1)^{{\vert a_1 \vert \vert a_2 \vert}} a_2 \cdot a_1 \,. $$

A homomorphism of superalgebras

$$ f \;\colon\; A \longrightarrow A' $$

is a homomorphism of associative algebras over the real numbers such that the $\mathbb{Z}/2$-grading is respected in that

$$ f(A_{even}) \subset A'_{even} \phantom{AAAAA} f(A_{odd}) \subset A'_{odd} \,. $$

For more details on superalgebra see at geometry of physics -- superalgebra.

Example

(basic examples of supercommutative superalgebras)

Basic examples of supercommutative superalgebras (def. 45) include the following:

  1. Every commutative algebra $A$ becomes a supercommutative superalgebra by declaring it to be all in even degree: $A = A_{even}$.

  2. For $V$ a finite dimensional real vector space, the Grassmann algebra $A \coloneqq \wedge^\bullet_{\mathbb{R}} V^\ast$ is a supercommutative superalgebra with $A_{even} \coloneqq \wedge^{even} V^\ast$ and $A_{odd} \coloneqq \wedge^{odd} V^\ast$.

    More explicitly, if $V = \mathbb{R}^s$ is a Cartesian space with canonical dual coordinates $(\theta^i)_{i = 1}^s$ then the Grassmann algebra $\wedge^\bullet (\mathbb{R}^s)^\ast$ is the real algebra which is generated from the $\theta^i$ regarded in odd degree and hence subject to the relation

    $$ \theta^i \cdot \theta^j = - \theta^j \cdot \theta^i \,. $$

    In particular this implies that all the $\theta^i$ are infinitesimal (def. 39):

    $$ \theta^i \cdot \theta^i = 0 \,. $$

  3. For $A_1$ and $A_2$ two supercommutative superalgebras, there is their tensor product supercommutative superalgebra $A_1 \otimes_{\mathbb{R}} A_2$. For example, for $X$ a smooth manifold with ordinary algebra of smooth functions $C^\infty(X)$ regarded as a supercommutative superalgebra by the first example above, the tensor product with a Grassmann algebra (second example above) is the supercommutative superalgebra

    $$ C^\infty(X) \otimes_{\mathbb{R}} \wedge^\bullet((\mathbb{R}^s)^\ast) $$

    whose elements may uniquely be expanded in the form

    $$ f + f_i \theta^i + f_{i j} \theta^i \theta^j + f_{i j k} \theta^i \theta^j \theta^k + \cdots + f_{i_1 \cdots i_s} \theta^{i_1} \cdots \theta^{i_s} \,, $$

    where the $f_{i_1 \cdots i_k} \in C^\infty(X)$ are smooth functions on $X$ which are skew-symmetric in their indices.
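The relations of the Grassmann algebra in the second example can be checked mechanically. The following is a minimal Python sketch, not part of the original text: elements are stored as dictionaries from sorted index tuples (basis monomials $\theta^{i_1} \cdots \theta^{i_k}$) to real coefficients, and the function names `mul_monomials`, `mul` and `theta` are hypothetical helpers chosen for this illustration.

```python
from itertools import product


def mul_monomials(a, b):
    """Multiply two basis monomials, given as sorted tuples of generator indices.

    Returns (sign, monomial). The sign is 0 if a generator repeats,
    since theta^i * theta^i = 0.
    """
    if set(a) & set(b):
        return 0, ()
    merged = list(a + b)
    # Each inversion corresponds to swapping two odd generators: one minus sign.
    sign = 1
    for i in range(len(merged)):
        for j in range(i + 1, len(merged)):
            if merged[i] > merged[j]:
                sign = -sign
    return sign, tuple(sorted(merged))


def mul(x, y):
    """Product of two Grassmann algebra elements {monomial: coefficient}."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        sign, m = mul_monomials(a, b)
        if sign:
            out[m] = out.get(m, 0) + sign * ca * cb
    # Drop terms whose coefficients cancelled.
    return {m: c for m, c in out.items() if c != 0}


def theta(i):
    """The odd generator theta^i."""
    return {(i,): 1}


if __name__ == "__main__":
    print(mul(theta(1), theta(2)))  # theta^1 theta^2
    print(mul(theta(2), theta(1)))  # minus theta^1 theta^2
    print(mul(theta(1), theta(1)))  # empty dict, i.e. zero
```

In this representation the empty dictionary is the zero element, and even elements such as $\theta^1 \theta^2$ commute with everything, as the supercommutativity rule of def. 45 requires.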

The following is the direct super-algebraic analog of the definition of infinitesimally thickened Cartesian spaces (def. 39):

Definition

(super Cartesian space)

A superpoint $Spec(A)$ is represented by a super-commutative superalgebra $A$ (def. 45) which as a $\mathbb{Z}/2$-graded vector space (super vector space) is a direct sum

$$ A \simeq_{\mathbb{R}} \langle 1 \rangle \oplus V $$

of the 1-dimensional even vector space $\langle 1 \rangle = \mathbb{R}$ of multiples of 1, with a finite dimensional super vector space $V$ that is a nilpotent ideal in $A$ in that for each element $a \in V$ there exists a natural number $n \in \mathbb{N}$ such that

$$ a^{n+1} = 0 \,. $$

More generally, a super Cartesian space $\mathbb{R}^n \times Spec(A)$ is represented by a super-commutative algebra $C^\infty(\mathbb{R}^n) \otimes A \in \mathbb{R} Alg$ which is the tensor product of the algebra of smooth functions $C^\infty(\mathbb{R}^n)$ on an actual Cartesian space of some dimension $n$ with the algebra of functions $A \simeq_{\mathbb{R}} \langle 1\rangle \oplus V$ of a superpoint (example 29).

Specifically, for $s \in \mathbb{N}$, there is the superpoint

(26) $$ \mathbb{R}^{0 \vert s} \;\coloneqq\; Spec\left( \wedge^\bullet (\mathbb{R}^s)^\ast \right) $$

whose algebra of functions is by definition the Grassmann algebra on $s$ generators $(\theta^i)_{i = 1}^s$ in odd degree (example 29).

We write

$$ \begin{aligned} \mathbb{R}^{b\vert s} & \coloneqq \mathbb{R}^b \times \mathbb{R}^{0 \vert s} \\ & = \mathbb{R}^b \times Spec( \wedge^\bullet(\mathbb{R}^s)^\ast ) \\ & = Spec\left( C^\infty(\mathbb{R}^b) \otimes_{\mathbb{R}} \wedge^\bullet (\mathbb{R}^s)^\ast \right) \end{aligned} $$

for the corresponding super Cartesian spaces whose algebra of functions is as in example 29.

A smooth function between two super Cartesian spaces

$$ \mathbb{R}^{n_1} \times Spec(A_1) \overset{f}{\longrightarrow} \mathbb{R}^{n_2} \times Spec(A_2) $$

is by definition dually a homomorphism of supercommutative superalgebras (def. 45) of the form

$$ C^\infty(\mathbb{R}^{n_1}) \otimes A_1 \overset{f^\ast}{\longleftarrow} C^\infty(\mathbb{R}^{n_2}) \otimes A_2 \,. $$

Example

(superpoint induced by a finite dimensional vector space)

Let $V$ be a finite dimensional real vector space. With $V^\ast$ denoting its dual vector space, write $\wedge^\bullet V^\ast$ for the Grassmann algebra that it generates. This being a supercommutative superalgebra, it defines a superpoint (def. 46).

We denote this superpoint by

$$ V_{odd} \simeq \mathbb{R}^{0 \vert dim(V)} \,. $$

All the differential geometry over Cartesian spaces that we discussed above generalizes immediately to super Cartesian spaces (def. 46) if we strictly adhere to the super sign rule, which says that whenever two odd-graded elements swap places, a minus sign is picked up. In particular we have the following generalization of the de Rham complex on Cartesian spaces discussed above.

Definition

(super differential forms on super Cartesian spaces)

Let $\mathbb{R}^{b\vert s}$ be a super Cartesian space (def. 46), hence the formal dual of the supercommutative superalgebra of the form

$$ C^\infty(\mathbb{R}^{b\vert s}) \;=\; C^\infty(\mathbb{R}^b) \otimes_{\mathbb{R}} \wedge^\bullet (\mathbb{R}^s)^\ast $$

with canonical even-graded coordinate functions $(x^i)_{i = 1}^b$ and odd-graded coordinate functions $(\theta^j)_{j = 1}^s$.

Then the de Rham complex of super differential forms on $\mathbb{R}^{b\vert s}$ is, in super-generalization of def. 11, the $\mathbb{Z} \times (\mathbb{Z}/2)$-graded commutative algebra

$$ \Omega^\bullet(\mathbb{R}^{b|s}) \;\coloneqq\; C^\infty(\mathbb{R}^{b|s}) \otimes_{\mathbb{R}} \wedge^\bullet \langle d x^1, \cdots, d x^b, \; d \theta^1, \cdots, d\theta^s \rangle $$

which is generated over $C^\infty(\mathbb{R}^{b\vert s})$ from new generators

$$ \underset{ \text{deg} = (1,even) }{\underbrace{ d x^i }} \phantom{AAAAA} \underset{ \text{deg} = (1,odd) }{ \underbrace{ d \theta^j } } $$

whose differential is defined in degree 0 by

$$ d f \;\coloneqq\; \frac{\partial f}{\partial x^i} d x^i + \frac{\partial f}{\partial \theta^j} d \theta^j $$

and extended from there as a bigraded derivation of bi-degree $(1,even)$, in super-generalization of def. 12.
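To make the degree-0 formula concrete, here is a worked sketch on $\mathbb{R}^{1\vert 1}$ with coordinates $(x, \theta)$. It assumes that $\partial/\partial\theta$ acts from the left; other conventions differ by signs. The component functions $f_0, f_1$ are notation introduced for this illustration.

```latex
% A general function on \mathbb{R}^{1\vert 1}:
f \;=\; f_0(x) + f_1(x)\, \theta
  \,, \qquad f_0, f_1 \in C^\infty(\mathbb{R}^1)

% Its partial derivatives (left derivative in \theta assumed):
\frac{\partial f}{\partial x} \;=\; f_0'(x) + f_1'(x)\, \theta
  \,, \qquad
\frac{\partial f}{\partial \theta} \;=\; f_1(x)

% Hence its de Rham differential:
d f \;=\; \left( f_0'(x) + f_1'(x)\, \theta \right) d x \;+\; f_1(x)\, d \theta

% Extending as a derivation of bi-degree (1,even), with d(d\theta) = 0:
d( \theta \, d\theta ) \;=\; d\theta \wedge d\theta
```

Since $d\theta$ has bi-degree $(1,odd)$, two such generators commute with each other in the wedge product, so $d\theta \wedge d\theta$ does not vanish, in contrast to $d x \wedge d x = 0$.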

Accordingly, the operation of contraction with tangent vector fields (def. 13) has bi-degree $(-1,\sigma)$ if the tangent vector has super-degree $\sigma$:

| generator | bi-degree |
|---|---|
| $x^a$ | (0, even) |
| $\theta^\alpha$ | (0, odd) |
| $dx^a$ | (1, even) |
| $d\theta^\alpha$ | (1, odd) |

| derivation | bi-degree |
|---|---|
| $d$ | (1, even) |
| $\iota_{\partial x^a}$ | (-1, even) |
| $\iota_{\partial \theta^\alpha}$ | (-1, odd) |

This means that if $\alpha \in \Omega^\bullet(\mathbb{R}^{b\vert s})$ is in bidegree $(n_\alpha, \sigma_\alpha)$, and $\beta \in \Omega^\bullet(\mathbb{R}^{b \vert s})$ is in bidegree $(n_\beta, \sigma_\beta)$, then

$$ \alpha \wedge \beta \; = \; (- 1)^{n_\alpha n_\beta + \sigma_\alpha \sigma_\beta} \; \beta \wedge \alpha \,. $$

Hence there are two contributions to the sign picked up when exchanging two super-differential forms in the wedge product:

  1. there is a “cohomological sign” which for commuting an $n_1$-form past an $n_2$-form is $(-1)^{n_1 n_2}$;

  2. in addition there is a “super-grading” sign which for commuting a $\sigma_1$-graded coordinate function past a $\sigma_2$-graded coordinate function (possibly under the de Rham differential) is $(-1)^{\sigma_1 \sigma_2}$.

For example:

$$ \begin{aligned} x^{a_1} (dx^{a_2}) & \;=\; + (dx^{a_2}) x^{a_1} \\ \theta^\alpha (dx^a) & \;=\; + (dx^a) \theta^\alpha \\ \theta^{\alpha_1} (d\theta^{\alpha_2}) & \;=\; - (d\theta^{\alpha_2}) \theta^{\alpha_1} \\ dx^{a_1} \wedge d x^{a_2} & \;=\; - d x^{a_2} \wedge d x^{a_1} \\ dx^a \wedge d \theta^{\alpha} & \;=\; - d\theta^{\alpha} \wedge d x^a \\ d\theta^{\alpha_1} \wedge d \theta^{\alpha_2} & \;=\; + d\theta^{\alpha_2} \wedge d \theta^{\alpha_1} \end{aligned} $$

(e.g. Castellani-D’Auria-Fré 91 (II.2.106) and (II.2.109), Deligne-Freed 99, section 6)

Beware that there is also another sign rule for super differential forms used in the literature. See at signs in supergeometry for further discussion.
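For the sign rule used in this text, the exchange signs in the examples above follow directly from the bi-degrees of the generators. The following small Python sketch tabulates them. The function name `exchange_sign` and the encoding of the super-degree as 0 (even) or 1 (odd) are choices made for this illustration only.

```python
def exchange_sign(deg_a, deg_b):
    """Sign picked up when swapping two homogeneous super differential forms.

    Each degree is a pair (n, sigma): n is the cohomological degree and
    sigma is the super-degree, encoded as 0 for even and 1 for odd.
    """
    (n_a, s_a), (n_b, s_b) = deg_a, deg_b
    return (-1) ** (n_a * n_b + s_a * s_b)


# Bi-degrees of the generators, as in the table of generators above.
X, THETA, DX, DTHETA = (0, 0), (0, 1), (1, 0), (1, 1)

if __name__ == "__main__":
    names = {X: "x", THETA: "theta", DX: "dx", DTHETA: "dtheta"}
    for a in (X, THETA, DX, DTHETA):
        for b in (X, THETA, DX, DTHETA):
            print(f"{names[a]:>6} past {names[b]:<6}: {exchange_sign(a, b):+d}")
```

Under the alternative sign rule mentioned above the exponent would be computed differently, so this helper applies only to the convention adopted here.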

\,

It is clear now by direct analogy with the definition of formal smooth sets (def. 40) what the corresponding supergeometric generalization is. For definiteness we spell it out yet once more:

Definition

(super smooth set)

A super smooth set $X$ is

  1. for each super Cartesian space $\mathbb{R}^n \times Spec(A)$ (def. 46) a set

    $$ X(\mathbb{R}^n \times Spec(A)) \in Set $$

    to be called the set of smooth functions or plots from $\mathbb{R}^n \times Spec(A)$ to $X$;

  2. for each smooth function $f \;\colon\; \mathbb{R}^{n_1} \times Spec(A_1) \longrightarrow \mathbb{R}^{n_2} \times Spec(A_2)$ between super Cartesian spaces a choice of function

    $$ f^\ast \;\colon\; X(\mathbb{R}^{n_2} \times Spec(A_2)) \longrightarrow X(\mathbb{R}^{n_1} \times Spec(A_1)) $$

    to be thought of as the precomposition operation

    $$ \left( \mathbb{R}^{n_2} \times Spec(A_2) \overset{\Phi}{\longrightarrow} X \right) \;\overset{f^\ast}{\mapsto}\; \left( \mathbb{R}^{n_1}\times Spec(A_1) \overset{f}{\to} \mathbb{R}^{n_2} \times Spec(A_2) \overset{\Phi}{\to} X \right) $$

such that

  1. (functoriality)

    1. If $id_{\mathbb{R}^n \times Spec(A)} \;\colon\; \mathbb{R}^n \times Spec(A) \to \mathbb{R}^n \times Spec(A)$ is the identity function on $\mathbb{R}^n \times Spec(A)$, then $\left(id_{\mathbb{R}^n \times Spec(A)}\right)^\ast \;\colon\; X(\mathbb{R}^n \times Spec(A)) \to X(\mathbb{R}^n \times Spec(A))$ is the identity function on the set of plots $X(\mathbb{R}^n \times Spec(A))$.

    2. If $\mathbb{R}^{n_1}\times Spec(A_1) \overset{f}{\to} \mathbb{R}^{n_2} \times Spec(A_2) \overset{g}{\to} \mathbb{R}^{n_3} \times Spec(A_3)$ are two composable smooth functions between super Cartesian spaces, then pullback of plots along them consecutively equals the pullback along the composition:

      $$ f^\ast \circ g^\ast = (g \circ f)^\ast $$

      i.e.

      $$ \array{ && X(\mathbb{R}^{n_2} \times Spec(A_2)) \\ & {}^{\mathllap{f^\ast}}\swarrow && \nwarrow^{\mathrlap{g^\ast}} \\ X(\mathbb{R}^{n_1} \times Spec(A_1)) && \underset{ (g \circ f)^\ast }{\longleftarrow} && X(\mathbb{R}^{n_3} \times Spec(A_3)) } $$

  2. (gluing)

    If $\{ U_i \times Spec(A) \overset{f_i \times id_{Spec(A)}}{\to} \mathbb{R}^n \times Spec(A)\}_{i \in I}$ is such that

    $$ \{ U_i \overset{f_i }{\to} \mathbb{R}^n \}_{i \in I} $$

    is a differentiably good open cover (def. 3) then the function which restricts $\mathbb{R}^n \times Spec(A)$-plots of $X$ to a set of $U_i \times Spec(A)$-plots

    $$ X(\mathbb{R}^n \times Spec(A)) \overset{( (f_i)^\ast )_{i \in I} }{\hookrightarrow} \underset{i \in I}{\prod} X(U_i \times Spec(A)) $$

    is a bijection onto the set of those tuples $(\Phi_i \in X(U_i \times Spec(A)))_{i \in I}$ of plots which are “matching families” in that they agree on intersections:

    $$ \Phi_i\vert_{(U_i \cap U_j) \times Spec(A)} = \Phi_j \vert_{(U_i \cap U_j)\times Spec(A)} $$

    i.e.

    $$ \array{ && (U_i \cap U_j) \times Spec(A) \\ & \swarrow && \searrow \\ U_i\times Spec(A) && && U_j \times Spec(A) \\ & {}_{\mathrlap{\Phi_i}}\searrow && \swarrow_{\mathrlap{\Phi_j}} \\ && X } $$

Finally, given two super formal smooth sets $X_1$ and $X_2$, a smooth function between them

$$ f \;\colon\; X_1 \longrightarrow X_2 $$

is

  • for each super Cartesian space $\mathbb{R}^n \times Spec(A)$ a function

    $$ f_\ast(\mathbb{R}^n \times Spec(A)) \;\colon\; X_1(\mathbb{R}^n \times Spec(A)) \longrightarrow X_2(\mathbb{R}^n \times Spec(A)) $$

such that

  • for each smooth function $g \colon \mathbb{R}^{n_1} \times Spec(A_1) \to \mathbb{R}^{n_2} \times Spec(A_2)$ between super Cartesian spaces, with induced pullback functions $g_1^\ast$ on plots of $X_1$ and $g_2^\ast$ on plots of $X_2$, we have

    $$ g^\ast_2 \circ f_\ast(\mathbb{R}^{n_2} \times Spec(A_2)) = f_\ast(\mathbb{R}^{n_1} \times Spec(A_1)) \circ g^\ast_1 $$

    i.e.

    $$ \array{ X_1(\mathbb{R}^{n_2} \times Spec(A_2)) &\overset{f_\ast(\mathbb{R}^{n_2}\times Spec(A_2) )}{\longrightarrow}& X_2(\mathbb{R}^{n_2} \times Spec(A_2)) \\ \mathllap{g_1^\ast}\downarrow && \downarrow\mathrlap{g^\ast_2} \\ X_1(\mathbb{R}^{n_1} \times Spec(A_1)) &\underset{f_\ast(\mathbb{R}^{n_1} \times Spec(A_1))}{\longrightarrow}& X_2(\mathbb{R}^{n_1} \times Spec(A_1)) } $$

(Yetter 88)

Basing supergeometry on super formal smooth sets is an instance of the general approach to geometry called functorial geometry or topos theory. For more background on this see at geometry of physics -- supergeometry.

In direct generalization of example 15 we have:

Example

(super Cartesian spaces are super smooth sets)

Let $X = \mathbb{R}^{b\vert s}$ be a super Cartesian space (def. 46). Then it becomes a super smooth set (def. 48) by declaring its plots $\Phi \in X(\mathbb{R}^n \times \mathbb{D})$ to be the algebra homomorphisms $C^\infty(\mathbb{R}^n \times \mathbb{D}) \leftarrow C^\infty(\mathbb{R}^{b\vert s})$.

Under this identification, morphisms between super Cartesian spaces are in natural bijection with their morphisms regarded as super smooth sets.

Stated more abstractly, this statement is an example of the Yoneda embedding over a subcanonical site.

Similarly, in direct generalization of prop. 17 we have:

Proposition

(plots of a super smooth set really are the smooth functions into the super smooth set)

Let $X$ be a super smooth set (def. 48). For $\mathbb{R}^n \times \mathbb{D}$ any super Cartesian space (def. 46) there is a natural function

$$ Hom_{SuperSmoothSet}(\mathbb{R}^n \times \mathbb{D} , X) \overset{\simeq}{\longrightarrow} X(\mathbb{R}^n \times \mathbb{D}) $$

from the set of homomorphisms of super smooth sets from $\mathbb{R}^n \times \mathbb{D}$ (regarded as a super smooth set via example 31) to $X$, to the set of plots of $X$ over $\mathbb{R}^n \times \mathbb{D}$, given by evaluating on the identity plot $id_{\mathbb{R}^n \times \mathbb{D}}$.

This function is a bijection.

This says that the plots of $X$, which initially bootstrap $X$ into being by declaring the would-be smooth functions into $X$, end up being the actual smooth functions into $X$.

Proof

This is the statement of the Yoneda lemma over the site of super Cartesian spaces.

We do not need to consider here supermanifolds more general than the super Cartesian spaces (def. 46). But for those readers familiar with the concept we include the following direct analog of the characterization of smooth manifolds according to def./prop. 44:

Definition/Proposition

(supermanifolds)

A supermanifold $X$ of super-dimension $(b,s) \in \mathbb{N} \times \mathbb{N}$ is

such that

  1. there exists an indexed set $\{ \mathbb{R}^{b\vert s} \overset{\phi_i}{\to} X\}_{i \in I}$ of morphisms of super smooth sets (def. 48) from super Cartesian spaces $\mathbb{R}^{b\vert s}$ (def. 46) (regarded as super smooth sets via example 31) into $X$, such that

    1. for every plot $\mathbb{R}^n \times \mathbb{D} \to X$ there is a differentiably good open cover (def. 3) restricted to which the plot factors through the $\mathbb{R}^{b\vert s}_i$;

    2. every $\phi_i$ is a local diffeomorphism according to def. 43, now with respect not just to infinitesimally thickened points, but with respect to superpoints;

  2. the bosonic part of $X$ is a smooth manifold according to def./prop. 44.

Finally we have the evident generalization of the smooth moduli space $\mathbf{\Omega}^\bullet$ of differential forms from example 19 to supergeometry:

Example

(universal smooth moduli spaces of super differential forms)

For $n \in \mathbb{N}$ write

$$ \mathbf{\Omega}^n \;\in\; SuperSmoothSet $$

for the super smooth set (def. 48) whose set of plots on a super Cartesian space $U \in SuperCartSp$ (def. 46) is the set of super differential forms (def. 47) of cohomological degree $n$

$$ \mathbf{\Omega}^n(U) \;\coloneqq\; \Omega^n(U) $$

and whose maps of plots are given by pullback of super differential forms.

The de Rham differential on super differential forms applied plot-wise yields a morphism of super smooth sets

(27) $$ d \;\colon\; \mathbf{\Omega}^n \longrightarrow \mathbf{\Omega}^{n+1} \,. $$

As before in def. 37 we then define for any super smooth set $X \in SuperSmoothSet$ its set of differential $n$-forms to be

$$ \Omega^n(X) \;\coloneqq\; Hom_{SuperSmoothSet}(X,\mathbf{\Omega}^n) $$

and we define the de Rham differential on these to be given by postcomposition with (27).

\,

Definition

(bosonic fields and fermionic fields)

For $\Sigma$ a spacetime, such as Minkowski spacetime (def. 23), if a fiber bundle $E \overset{fb}{\longrightarrow} \Sigma$ with total space a super Cartesian space (def. 46) (or more generally a supermanifold, def./prop. 49) is regarded as a super-field bundle (def. 34), then

In components, if $E = \Sigma \times F$ is a trivial bundle with fiber a super Cartesian space (def. 46) with even-graded coordinates $(\phi^a)$ and odd-graded coordinates $(\psi^A)$, then the $\phi^a$ are called the bosonic field coordinates, and the $\psi^A$ are called the fermionic field coordinates.

What is crucial for the discussion of field theory is the following immediate supergeometric analog of the smooth structure on the space of field histories from example 16:

Example

(supergeometric space of field histories)

Let $E \overset{fb}{\to} \Sigma$ be a super-field bundle (def. 34, def. 50).

Then the space of sections, hence the space of field histories, is the super formal smooth set (def. 48)

$$ \Gamma_\Sigma(E) \in SuperSmoothSet $$

whose plots $\Phi_{(-)}$ for a given Cartesian space $\mathbb{R}^n$ and superpoint $\mathbb{D}$ (def. 46), with the Cartesian products $U \coloneqq \mathbb{R}^n \times \mathbb{D}$ and $U \times \Sigma$ regarded as super smooth sets according to example 31, are defined to be the morphisms of super smooth sets (def. 48)

$$ \array{ U \times \Sigma &\overset{\Phi_{(-)}(-)}{\longrightarrow}& E } $$

which make the following diagram commute:

$$ \array{ && E \\ & {}^{\mathllap{\Phi_{(-)}(-)}}\nearrow & \downarrow^{\mathrlap{fb}} \\ U \times \Sigma &\underset{pr_2}{\longrightarrow}& \Sigma } \,. $$

Explicitly, if $\Sigma$ is a Minkowski spacetime (def. 23) and $E = \Sigma \times F$ a trivial field bundle with field fiber a super vector space (example 9, example 50) this means dually that a plot $\Phi_{(-)}$ of the super smooth set of field histories is a homomorphism of supercommutative superalgebras (def. 45)

$$ \array{ C^\infty(U \times \Sigma) &\overset{\left(\Phi_{(-)}(-)\right)^\ast}{\longleftarrow}& C^\infty(E) } $$

which makes the following diagram commute (all arrows reversed relative to the previous diagram):

$$ \array{ && C^\infty(E) \\ & {}^{\mathllap{\left( \Phi_{(-)}(-) \right)^\ast }}\swarrow & \uparrow^{\mathrlap{fb^\ast}} \\ C^\infty(U \times \Sigma) &\underset{pr_2^\ast}{\longleftarrow}& C^\infty(\Sigma) } \,. $$

We will focus on discussing the supergeometric space of field histories (example 33) of the Dirac field, which we consider in example 35 below; but first we discuss some relevant basics of general supergeometry.

Example 33 is really a special case of a general relative mapping space-construction as in example 25. This immediately generalizes also to the supergeometric context.

Definition

(super-mapping space out of a super Cartesian space)

Let $X$ be a super Cartesian space (def. 46) and let $Y$ be a super smooth set (def. 48). Then the mapping space

$$ [X,Y] \;\in\; SuperSmoothSet $$

of super smooth functions from $X$ to $Y$ is the super formal smooth set whose $U$-plots are the morphisms of super smooth sets from the Cartesian product of super Cartesian spaces $U \times X$ to $Y$, hence the $U \times X$-plots of $Y$:

$$ [X,Y](U) \;\coloneqq\; Y(U \times X) \,. $$

In direct generalization of the synthetic tangent bundle construction (example 26) to supergeometry we have

Definition

(odd tangent bundle)

Let $X$ be a super smooth set (def. 48) and $\mathbb{R}^{0\vert 1}$ the superpoint (26). Then the supergeometric mapping space

$$ \array{ T_{odd} X & \coloneqq& [\mathbb{R}^{0\vert 1}, X] \\ {}^{\mathllap{tb_{odd}}}\downarrow && \downarrow^{\mathrlap{ [ \ast \to \mathbb{R}^{0 \vert 1}, X ] }} \\ X & = & X } $$

is called the odd tangent bundle of XX.

Example

(mapping space of superpoints)

Let $V$ be a finite dimensional real vector space and consider its corresponding superpoint $V_{odd}$ from example 30. Then the mapping space (def. 51) out of the superpoint $\mathbb{R}^{0\vert 1}$ (def. 46) into $V_{odd}$ is the Cartesian product $V_{odd} \times V$:

$$ [\mathbb{R}^{0\vert 1}, V_{odd}] \;\simeq\; V_{odd} \times V \,. $$

By def. 52 this says that $V_{odd} \times V$ is the “odd tangent bundle” of $V_{odd}$.

Proof

Let $U$ be any super Cartesian space. Then by definition we have the following sequence of natural bijections of sets of plots

$$ \begin{aligned} \left[ \mathbb{R}^{0\vert 1}, V_{odd} \right](U) & = Hom_{SuperSmoothSet}( \mathbb{R}^{0\vert 1} \times U, V_{odd} ) \\ & \simeq Hom_{\mathbb{R}sAlg}( \wedge^\bullet(V^\ast)\,,\, C^\infty(U)[\theta]/(\theta^2) ) \\ & \simeq Hom_{\mathbb{R}Vect}( V^\ast \,,\, C^\infty(U)_{odd} \oplus C^\infty(U)_{even}\langle \theta\rangle ) \\ & \simeq Hom_{\mathbb{R}Vect}( V^\ast\,,\, C^\infty(U)_{odd} ) \,\times\, Hom_{\mathbb{R}Vect}( V^\ast, C^\infty(U)_{even} ) \\ & \simeq V_{odd}(U) \times V(U) \\ & \simeq (V_{odd} \times V)(U) \end{aligned} $$

Here in the third line we used that the Grassmann algebra $\wedge^\bullet V^\ast$ is free on its generators in $V^\ast$, meaning that a homomorphism of supercommutative superalgebras out of the Grassmann algebra is uniquely fixed by the underlying degree-preserving linear function on these generators. Since in a Grassmann algebra all the generators are in odd degree, this is equivalently a linear map from $V^\ast$ to the odd-graded part of the real vector space underlying $C^\infty(U)[\theta]/(\theta^2)$, which is the direct sum $C^\infty(U)_{odd} \oplus C^\infty(U)_{even}\langle \theta \rangle$.

Then in the fourth line we used that finite direct sums are Cartesian products, so that linear maps into a direct sum are pairs of linear maps into the direct summands.

That all these bijections are natural means that they are compatible with morphisms $U \to U'$. Therefore $[\mathbb{R}^{0\vert 1}, V_{odd}]$ and $V_{odd} \times V$ are the same as seen by super-smooth plots, hence they are isomorphic as super smooth sets.
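As a sanity check of the chain of bijections, here is a worked sketch in the smallest non-trivial case. We assume $V = \mathbb{R}$ with dual generator $\xi \in V^\ast$ and probe with $U = \mathbb{R}^{0\vert 1}$ with odd coordinate $\eta$. The symbols $\xi$, $\eta$, $a$, $b$ are notation introduced only for this illustration.

```latex
% Function algebras involved:
C^\infty(U) \;=\; \mathbb{R}[\eta]/(\eta^2)
  \,, \qquad
\wedge^\bullet V^\ast \;=\; \mathbb{R}[\xi]/(\xi^2)

% The odd part of C^\infty(U)[\theta]/(\theta^2) is spanned by \eta and \theta:
\left( C^\infty(U)[\theta]/(\theta^2) \right)_{odd}
  \;=\;
  \underbrace{ \langle \eta \rangle }_{ C^\infty(U)_{odd} }
  \;\oplus\;
  \underbrace{ \langle 1 \rangle \theta }_{ C^\infty(U)_{even} \langle \theta \rangle }

% A homomorphism of superalgebras is fixed by the image of the odd generator:
\xi \;\mapsto\; a\, \eta + b\, \theta
  \,, \qquad a, b \in \mathbb{R}
  \,, \qquad
  (a\, \eta + b\, \theta)^2 \;=\; a b\, (\eta \theta + \theta \eta) \;=\; 0

% Hence
[\mathbb{R}^{0\vert 1}, V_{odd}](\mathbb{R}^{0\vert 1})
  \;\simeq\;
  \underbrace{ \mathbb{R} }_{ V_{odd}(U) \,\ni\, a }
  \times
  \underbrace{ \mathbb{R} }_{ V(U) \,\ni\, b }
```

For $U = \ast$ the point, the same computation leaves only the coefficient $b$, so the points of the odd tangent bundle of $V_{odd}$ form the ordinary vector space $V$, even though $V_{odd}$ itself has only a single point.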

With this supergeometry in hand we finally turn to defining the Dirac field species:

Example

(field bundle for Dirac field)

For Σ\Sigma being Minkowski spacetime (def. 23), of dimension 2+12+1, 3+13+1, 5+15+1 or 9+19+1, let SS be the spin representation from prop. 15, whose underlying real vector space is

S={ 2 2 | p+1=2+1 2 2 | p+1=3+1 2 2 | p+1=5+1 𝕆 2𝕆 2 | p+1=9+1 S \;=\; \left\{ \array{ \mathbb{R}^2 \oplus \mathbb{R}^2 & \vert & p + 1 = 2+1 \\ \mathbb{C}^2 \oplus \mathbb{C}^2 &\vert& p + 1 = 3 + 1 \\ \mathbb{H}^2 \oplus \mathbb{H}^2 &\vert& p + 1 = 5 + 1 \\ \mathbb{O}^2 \oplus \mathbb{O}^2 &\vert& p + 1 = 9 + 1 } \right.

With

S odd 0|dim(S) S_{odd} \simeq \mathbb{R}^{0 \vert dim(S)}

the corresponding superpoint (example 30), then the field bundle for the Dirac field on Σ\Sigma is

EΣ×S oddpr 1Σ, E \;\coloneqq\; \Sigma \times S_{odd} \overset{pr_1}{\to} \Sigma \,,

hence the field fiber is the superpoint S oddS_{odd}. This is the corresponding spinor bundle on Minkowski spacetime, with fiber in odd super-degree.

The traditional two-component spinor basis from remark 7 provides fermionic field coordinates (def. 50) on the field fiber S oddS_{odd}:

(ψ A) A=1 4=((χ a),(ξ a˙)) a,a˙=1,2. \left( \psi^A \right)_{A = 1}^4 \;=\; \left( (\chi_a), (\xi^{\dagger \dot a}) \right)_{a,\dot a = 1,2} \,.

Notice that these are 𝕂\mathbb{K}-valued odd functions: For instance if 𝕂=\mathbb{K} = \mathbb{C} then each χ a\chi_a in turn has two components, a real part and an imaginary part.

A key point with the field bundle of the Dirac field (example 35) is that the field fiber coordinates $(\psi^A)$ or $\left((\chi_a), (\xi^{\dagger \dot a})\right)$ are now odd-graded elements in the function algebra on the field fiber, which is the Grassmann algebra $C^\infty(S_{odd}) = \wedge^\bullet(S^\ast)$. Therefore they anti-commute with each other:

(28) $$ \psi^\alpha \psi^{\beta} = - \psi^{\beta} \psi^\alpha \,. $$
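
As a small worked consequence of the anti-commutation rule (28), one may set both indices equal, or permute a triple product. The following sketch uses nothing beyond (28); the index labels $A, B, C$ are just names for the generators $\psi^A$.

```latex
% Consequences of (28) in the Grassmann algebra \wedge^\bullet(S^\ast) (a real algebra, so 2 is invertible)
\[
\begin{aligned}
  \psi^A \psi^A &= - \psi^A \psi^A
  \quad\Longrightarrow\quad
  (\psi^A)^2 = 0 \quad \text{(no sum over } A \text{)} \,,
  \\
  \psi^A \psi^B \psi^C &= - \psi^B \psi^A \psi^C = + \psi^B \psi^C \psi^A \,.
\end{aligned}
\]
```

In particular a product of more than $dim(S)$ of the real generators necessarily repeats one of them and hence vanishes, so every function on the field fiber is a polynomial in the odd coordinates.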

snippet grabbed from (Dermisek 09)

We now analyze the supergeometric space of field histories of the Dirac field (prop. 53 below) and conclude by highlighting the crucial role of supergeometry (remark 10 below).

Proposition

(space of field histories of the Dirac field)

Let $E = \Sigma \times S_{odd} \overset{pr_1}{\to} \Sigma$ be the super-field bundle (def. 50) for the Dirac field over Minkowski spacetime $\Sigma = \mathbb{R}^{p,1}$ from example 35.

Then the corresponding supergeometric space of field histories

$$ \Gamma_\Sigma(\Sigma \times S_{odd}) \;\in\; SuperSmoothSet $$

from example 33 has the following properties:

  1. For $U = \mathbb{R}^n$ an ordinary Cartesian space (with no super-geometric thickening, def. 46) there is only a single $U$-parameterized collection of field histories, hence a single plot

    $$ \Psi_{(-)}\;\colon\;\mathbb{R}^n \overset{ 0 }{\longrightarrow} \Gamma_\Sigma(\Sigma \times S_{odd}) $$

    and this corresponds to the zero section, hence to the trivial Dirac field

    $$ \Psi^A_{(-)} = 0 \,. $$
  2. For $U = \mathbb{R}^{n \vert 1}$ a super Cartesian space (def. 46) with a single super-odd dimension, the $U$-parameterized collections of field histories

    $$ \Psi_{(-)} \;\colon\; \mathbb{R}^{n\vert 1} \longrightarrow \Gamma_\Sigma(\Sigma \times S_{odd}) $$

    are in natural bijection with plots of sections of the bosonic-field bundle with field fiber $S_{even} = S$ the spin representation regarded as an ordinary vector space:

    $$ \theta \Psi_{(-)} \;\colon\; \mathbb{R}^n \longrightarrow \Gamma_\Sigma(\Sigma \times S_{even}) \,. $$

Moreover, these two kinds of plots determine the fermionic field space completely: it is in fact isomorphic, as a super vector space, to the bosonic field space shifted to odd degree (as in example 30):

$$ \Gamma_\Sigma(\Sigma \times S_{odd}) \;\simeq\; \left( \Gamma_\Sigma(\Sigma \times S_{even}) \right)_{odd} \,. $$
Proof

In the first case, the plot is a morphism of super Cartesian spaces (def. 46) of the form

$$ \mathbb{R}^n \times \mathbb{R}^{p,1} \longrightarrow S_{odd} \,. $$

By definition this is dually a homomorphism of real supercommutative superalgebras

$$ C^\infty(\mathbb{R}^n \times \mathbb{R}^{p,1}) \longleftarrow \wedge^\bullet S^\ast $$

from the Grassmann algebra on the dual vector space of the spin representation $S$ to the ordinary algebra of smooth functions on $\mathbb{R}^n \times \mathbb{R}^{p,1}$. But the latter has no elements in odd degree, and hence all the Grassmann generators need to be sent to zero.

For the second case, notice that a morphism of the form

$$ \mathbb{R}^{n\vert 1} \overset{\Psi_{(-)}}{\longrightarrow} S_{odd} $$

is by def. 52 naturally identified with a morphism of the form

$$ \mathbb{R}^n \overset{}{\longrightarrow} [\mathbb{R}^{0 \vert 1}, S_{odd}] \simeq S_{odd} \times S_{even} \,, $$

where the identification on the right is from example 34.

By the nature of Cartesian products these morphisms in turn are naturally identified with pairs of morphisms of the form

$$ \left( \array{ \mathbb{R}^n &\overset{}{\longrightarrow}& S_{odd}\,, \\ \mathbb{R}^n &\overset{}{\longrightarrow}& S_{even} } \right) \,. $$

Now, as in the first point above, here the first component is uniquely fixed to be the zero morphism $\mathbb{R}^n \overset{0}{\to} S_{odd}$, and hence only the second component is free to be chosen. This is precisely the claim to be shown.
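
The second case can also be read off directly in terms of function algebras. The following is a minimal sketch, assuming (as in the proof above) that the dependence on spacetime is suppressed, and writing $u$ for the coordinates on $\mathbb{R}^n$ and $\theta$ for the single odd coordinate; these names are chosen here only for illustration.

```latex
% Functions on R^{n|1}: an even part f and an odd part \theta g, with \theta^2 = 0
\[
  C^\infty(\mathbb{R}^{n \vert 1})
  \;=\;
  \left\{ f(u) + \theta \, g(u) \;\vert\; f, g \in C^\infty(\mathbb{R}^n) \right\}
  \,, \qquad \theta^2 = 0 \,.
\]
% A grading-preserving algebra homomorphism sends each odd generator \psi^A to an odd element
\[
  \psi^A \;\mapsto\; \theta \, \Psi^A(u) \,, \qquad \Psi^A \in C^\infty(\mathbb{R}^n) \,.
\]
% The anti-commutation relations are then respected automatically, since both sides vanish
\[
  (\theta \Psi^A)(\theta \Psi^B) \;=\; \theta^2 \, \Psi^A \Psi^B \;=\; 0 \;=\; - (\theta \Psi^B)(\theta \Psi^A) \,.
\]
```

So the only free data are the ordinary functions $\Psi^A$, which is the bosonic plot $\theta \Psi_{(-)}$ of the proposition.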

Remark

(supergeometric nature of the Dirac field)

Proposition 53 shows how two basic facts about the Dirac field, which may superficially seem to be in tension with each other, are properly unified by supergeometry:

  1. On the one hand a field history $\Psi$ of the Dirac field is not an ordinary section of an ordinary vector bundle. In particular its component functions $\psi^A$ anti-commute with each other, which is not the case for ordinary functions. This is crucial for the Lagrangian density of the Dirac field to be well defined; we come to this below in example 43.

  2. On the other hand a field history of the Dirac field is supposed to be a spinor, hence a section of a spinor bundle, which is an ordinary vector bundle.

Therefore prop. 53 serves to show how, even though a Dirac field is not defined to be an ordinary section of an ordinary vector bundle, it is nevertheless encoded by such an ordinary section: one says that this ordinary section is a “superfield-component” of the Dirac field, the one linear in a Grassmann variable $\theta$.

\,

This concludes our discussion of the concept of fields itself. In the following chapter we consider the variational calculus of fields.

\,

Field variations

Given a field bundle as in def. 34 above, we know what type of quantities the corresponding field histories assign to a given spacetime point (a given event). Among all consistent such field histories, some are to qualify as those that “may occur in reality” if we think of the field theory as a means to describe parts of the observable universe. Moreover, if the reality to be described does not exhibit “action at a distance” then admissibility of its field histories should be determined over arbitrarily small spacetime regions, in fact over the infinitesimal neighbourhood of any spacetime point (remark 11 below). This means equivalently that the realized field histories should be those that satisfy a given differential equation, namely an equation between the partial derivatives of the field history at any spacetime point. This is called the equation of motion of the field theory (def. 61 below).

In order to formalize this, it is useful to first collect all the possible partial derivatives that a field history may have at any given point into one big space of “field derivatives at spacetime points”. This collection is called the jet bundle of the field bundle, given as def. 54 below.

Moving around in this space means changing the possible values of fields and their derivatives, hence varying the fields. Accordingly, variational calculus of fields is just differential calculus on the jet bundle of the field bundle; this we consider in def. 59 below.

\,

Definition

(jet bundle of a trivial vector bundle over Minkowski spacetime)

Given a field fiber super vector space $F = \mathbb{R}^{b\vert s}$ with linear basis $(\phi^a)$, then for $k \in \mathbb{N}$ a natural number, the order-$k$ jet bundle

$$ \array{ J^k_{\Sigma}( E ) \\ \downarrow^{\mathrlap{jb_k}} \\ \Sigma } $$

over Minkowski spacetime $\Sigma$ of the trivial vector bundle

$$ E \coloneqq \Sigma \times F $$

is the super Cartesian space (def. 46) which is spanned by coordinate functions to be denoted as follows:

$$ \left( (x^\mu) \,,\, (\phi^a ) \,,\, ( \phi^a_{,\mu} ) \,,\, ( \phi^a_{,\mu_1\mu_2} ) \,,\, \cdots \,,\, ( \phi^a_{,\mu_1 \cdots \mu_k} ) \right) $$

where the indices $\mu, \mu_1, \mu_2, \cdots$ range from $0$ to $p$, while the index $a$ ranges from $1$ to $b$ for the even field coordinates and from $b+1$ to $b+s$ for the odd-graded field coordinates, and where the lower indices are symmetric:

(29) $$ \phi^a_{,\mu_1 \cdots \mu_{i} \cdots \mu_j \cdots \mu_k} \;=\; \phi^a_{,\mu_1 \cdots \mu_{j} \cdots \mu_i \cdots \mu_k} \,. $$

In terms of these coordinates the bundle projection map $jb_k$ is just the one that remembers the spacetime coordinates $x^\mu$ and forgets the values of the field $\phi^a$ and of its derivatives $\phi^a_{,\mu}, \phi^a_{,\mu_1 \mu_2}, \cdots$. Similarly there are intermediate projection maps

$$ \array{ \cdots &\overset{jb_{3,2}}{\longrightarrow}& J^{2}_\Sigma(E) &\overset{jb_{2,1}}{\longrightarrow}& J^1_\Sigma(E) &\overset{jb_{1,0}}{\longrightarrow}& E \\ && &{}_{\mathllap{jb_2}}\searrow& {}^{\mathllap{jb_1}}\downarrow &\swarrow_{\mathrlap{fb}}& \\ && && \Sigma && } $$

given by forgetting coordinates with more indices.
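
To get a feeling for the size of these spaces, one can count coordinates. The following sketch assumes a single real scalar field ($b = 1$, $s = 0$) on 4-dimensional Minkowski spacetime ($p = 3$); because the lower indices are symmetric (29), the independent coordinates of order $l$ are labelled by multisets of $l$ spacetime indices.

```latex
% Number of independent jet coordinates \phi^a_{,\mu_1 \cdots \mu_l} of order l, per field component:
% multisets of size l drawn from the p+1 index values 0, ..., p
\[
  \#\left\{ \phi^a_{,\mu_1 \cdots \mu_l} \right\} \;=\; \binom{p + l}{l} \,.
\]
% Example: p = 3, one scalar field, jet order k = 2
\[
  \dim J^2_\Sigma(E)
  \;=\;
  \underbrace{4}_{x^\mu}
  + \underbrace{1}_{\phi}
  + \underbrace{4}_{\phi_{,\mu}}
  + \underbrace{10}_{\phi_{,\mu_1 \mu_2}}
  \;=\; 19 \,.
\]
```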

The infinite-order jet bundle

$$ J^\infty_\Sigma(E) \in SuperSmoothSet $$

is the inverse limit (projective limit) of super smooth sets (def. 48) over these finite-order jet bundles. Explicitly this means that it is the super smooth set which is defined by the fact that a smooth function (a plot, by prop. 18)

$$ U \overset{f}{\longrightarrow} J^\infty_\Sigma(E) $$

from some super Cartesian space $U$ is equivalently a system of ordinary smooth functions into all the finite-order jet spaces

$$ \left( U \overset{f_k}{\longrightarrow} J^k_\Sigma(E) \right)_{k \in \mathbb{N}} \,, $$

such that this system is compatible with the above projection maps, i.e. such that

$$ \underset{k \in \mathbb{N}}{\forall} \left( jb_{k+1,k} \circ f_{k+1} = f_k \right) \phantom{AAAAAAA} \array{ && && U && \\ && & {}^{\mathllap{f_2}}\swarrow& {}_{\mathllap{f_1}}\downarrow &\searrow^{f_0}& \\ \cdots &\overset{jb_{3,2}}{\longrightarrow}& J^{2}_\Sigma(E) &\overset{jb_{2,1}}{\longrightarrow}& J^1_\Sigma(E) &\overset{jb_{1,0}}{\longrightarrow}& E \\ && &{}_{\mathllap{jb_2}}\searrow& {}^{\mathllap{jb_1}}\downarrow &\swarrow_{\mathrlap{fb}}& \\ && && \Sigma && } $$

The coordinate functions $\phi^a_{,\mu_1 \cdots \mu_k}$ on a jet bundle (def. 54) are to be thought of as partial derivatives $\frac{\partial}{\partial x^{\mu_1}} \cdots \frac{\partial}{\partial x^{\mu_k}} \Phi^a$ of components $\Phi^a$ of would-be field histories $\Phi$. The power of the jet bundle is that it allows us to disentangle relations between would-be partial derivatives of field history components in themselves from consideration of actual field histories. In traditional physics texts this is often done implicitly. We may make it fully explicit by the operation of jet prolongation, which reads in a field history and records all its partial derivatives in the form of a section of the jet bundle:

Definition

(jet prolongation)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34) which happens to be a trivial vector bundle over Minkowski spacetime as in example 9.

There is a smooth function from the space of sections of $E$, the space of field histories (example 33), to the space of sections of the jet bundle $J^\infty_\Sigma(E) \overset{jb_\infty}{\to} \Sigma$ (def. 54) which records the field $\Phi$ and all its spacetime derivatives:

$$ \array{ \Gamma_\Sigma(E) &\overset{j^\infty_\Sigma}{\longrightarrow}& \Gamma_\Sigma(J^\infty_\Sigma(E)) \\ (\Phi^a) &\mapsto& \left( \left( \Phi^a \right) \,,\, \left( \frac{\partial \Phi^a}{\partial x^\mu} \right) \,,\, \left( \frac{\partial^2 \Phi^a}{\partial x^{\mu_1} \partial x^{\mu_2}} \right) \,,\, \cdots \right) } \,. $$

This is called the operation of jet prolongation: $j^\infty_\Sigma(\Phi)$ is the jet prolongation of $\Phi$.
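
For illustration, here is the jet prolongation of one concrete field history. This is only a sketch under assumptions made here: a single real scalar field and a plane-wave profile with a constant covector $(k_\mu)$, neither of which is singled out by the text.

```latex
% Hypothetical example: scalar field history \Phi(x) = \sin(k_\nu x^\nu) with constant k_\nu
\[
  j^\infty_\Sigma(\Phi)
  \;=\;
  \Big(
    \underbrace{\sin(k_\nu x^\nu)}_{\phi}
    \,,\;
    \underbrace{k_\mu \cos(k_\nu x^\nu)}_{\phi_{,\mu}}
    \,,\;
    \underbrace{- k_{\mu_1} k_{\mu_2} \sin(k_\nu x^\nu)}_{\phi_{,\mu_1 \mu_2}}
    \,,\;
    \cdots
  \Big) \,.
\]
```

The second-order component is symmetric in $\mu_1, \mu_2$, as required of jet coordinates by (29).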

Remark

(jet bundle in terms of synthetic differential geometry)

In terms of the infinitesimal geometry of formal smooth sets (def. 40) the jet bundle $J^\infty_\Sigma(E) \overset{jb_\infty}{\to} \Sigma$ (def. 54) of a field bundle $E \overset{fb}{\to}\Sigma$ has the following incarnation:

A section of the jet bundle over a point $x \in \Sigma$ of spacetime (an event) is equivalently a section of the original field bundle over the infinitesimal neighbourhood $\mathbb{D}_x$ of that point (example 27):

$$ \left\{ \array{ && J^\infty_\Sigma(E) \\ & \nearrow & \downarrow^{\mathrlap{jb_\infty}} \\ \{x\} &\hookrightarrow& \Sigma } \phantom{AA} \right\} \phantom{AA} \simeq \phantom{AA} \left\{ \array{ && E \\ & {}^{\mathllap{}}\nearrow & \downarrow^{\mathrlap{fb}} \\ \mathbb{D}_x &\hookrightarrow& \Sigma } \phantom{AA} \right\} \,. $$

Moreover, given a field history $\Phi$, hence a section of the field bundle, its jet prolongation $j^\infty_\Sigma(\Phi)$ (def. 55) is that section of the jet bundle which under the above identification is simply the restriction of $\Phi$ to the infinitesimal neighbourhood of $x$:

$$ \array{ && E \\ & {}^{\mathllap{\Phi}}\nearrow & \downarrow^{\mathrlap{fb}} \\ \Sigma & = & \Sigma } \phantom{AAAA}\overset{j^\infty_\Sigma}{\mapsto} \phantom{AAAA} \array{ && J^\infty_\Sigma(E) \\ & {}^{\mathllap{j^\infty_\Sigma(\Phi)}}\nearrow & \downarrow^{\mathrlap{jb_\infty}} \\ \Sigma &=& \Sigma } \phantom{AAAA} \overset{(-)\vert_{\{x\}} }{\mapsto} \phantom{AAAA} \array{ && E \\ & {}^{\mathllap{\Phi\vert_{\mathbb{D}_x}}}\nearrow & \downarrow^{\mathrlap{fb}} \\ \mathbb{D}_x &\hookrightarrow& \Sigma } \,. $$

This follows with an argument as in example 20.

Hence in synthetic differential geometry we have:

The jet of a section $\Phi$ at $x$ is simply the restriction of that section to the infinitesimal neighbourhood of $x$.

(Khavkine-Schreiber 17, section 3.3)

So the canonical coordinates on the jet bundle are the spacetime-point-wise possible values of fields and field derivatives, while the jet prolongation picks the actual collections of field derivatives that may occur for an actual field history.

Example

(universal Faraday tensor/field strength on jet bundle)

Consider the field bundle (def. 34) of the electromagnetic field (example 11) over Minkowski spacetime $\Sigma$ (def. 23), i.e. the cotangent bundle $E = T^\ast \Sigma$ (def. 9) with jet coordinates $((x^\mu), (a_\mu), (a_{\mu,\nu}), \cdots )$ (def. 54). Consider the functions on the jet bundle given by the linear combinations

(30) $$ \begin{aligned} f_{\mu \nu} & \coloneqq a_{[\nu,\mu]} \\ & \coloneqq \tfrac{1}{2}\left( a_{\nu,\mu} - a_{\mu,\nu} \right) \end{aligned} $$

of the first order jets.

Then for an electromagnetic field history (“vector potential”), hence a section

$$ A \in \Gamma_\Sigma(T^\ast \Sigma) = \Omega^1(\Sigma) $$

with components $A^\ast (a_\mu) = A_\mu$, its jet prolongation (def. 55)

$$ j^\infty_\Sigma(A) \in \Gamma_\Sigma(J^\infty_\Sigma(T^\ast \Sigma)) $$

has components

$$ \left( (A_\mu), \left( \frac{\partial A_\mu}{\partial x^\nu} \right) , \cdots \right) \,. $$

The pullbacks of the functions $f_{\mu \nu}$ (30) along this jet prolongation are the components of the Faraday tensor of the field (20):

$$ \begin{aligned} \left(j^\infty_\Sigma(A)\right)^\ast(f_{\mu \nu}) & = F_{\mu \nu} \\ & = (d A)_{\mu \nu} \,. \end{aligned} $$
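
As a quick consistency check of (30), one may evaluate it on a pure-gauge field history. The sketch below assumes a hypothetical smooth function $\lambda$ on $\Sigma$ and the vector potential $A = d\lambda$; the jet coordinate $a_{\nu,\mu}$ pulls back to $\partial A_\nu / \partial x^\mu$ as in def. 55.

```latex
% Pure gauge: A_\mu = \partial_\mu \lambda for a hypothetical smooth function \lambda
\[
\begin{aligned}
  \left( j^\infty_\Sigma(d\lambda) \right)^\ast (f_{\mu\nu})
  &= \tfrac{1}{2} \left(
       \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}
     \right)
  \\
  &= \tfrac{1}{2} \left(
       \frac{\partial^2 \lambda}{\partial x^\mu \partial x^\nu}
       -
       \frac{\partial^2 \lambda}{\partial x^\nu \partial x^\mu}
     \right)
  \\
  &= 0 \,.
\end{aligned}
\]
```

This vanishing reflects the symmetry (29) of the second-order jet coordinates.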

More generally, for $\mathfrak{g}$ a Lie algebra and

$$ E \coloneqq T^\ast \Sigma \otimes \mathfrak{g} $$

the field bundle for Yang-Mills theory from example 12, consider the functions

$$ f^\alpha_{\mu \nu} \;\in \; \Omega^{0,0}_\Sigma(E) = C^\infty(J^\infty_\Sigma(E)) $$

on the jet bundle given by

(31) $$ \begin{aligned} f^\alpha_{\mu \nu} & \coloneqq \tfrac{1}{2} \left( a^\alpha_{\nu,\mu} - a^\alpha_{\mu,\nu} + \gamma^{\alpha}{}_{\beta \gamma} a^\beta_{\mu} a^\gamma_{\nu} \right) \end{aligned} $$

where $(\gamma^\alpha{}_{\beta \gamma})$ are the structure constants of the Lie algebra as in (21), and where, as in (30), square brackets around indices denote anti-symmetrization.

We may call this the universal Yang-Mills field strength, being the covariant exterior derivative of the universal Yang-Mills field history.

For $\mathfrak{g} = \mathbb{R}$ the line Lie algebra, whose structure constants vanish, the expression (31) reduces to the universal Faraday tensor (30) for the electromagnetic field (example 36).

For $A \in \Gamma_\Sigma(T^\ast \Sigma \otimes \mathfrak{g}) = \Omega^1(\Sigma,\mathfrak{g})$ a field history of Yang-Mills theory, hence a Lie algebra-valued differential 1-form, the values of these functions on that field history are called the components of the covariant exterior derivative or field strength

$$ \begin{aligned} F_{\mu \nu} & \coloneqq A^\ast(D_{[\mu} a_{\nu]}) \\ & = (d_A A)_{\mu \nu} \end{aligned} $$
Example

(universal B-field strength on jet bundle)

Consider the field bundle (def. 34) of the B-field (example 14) over Minkowski spacetime $\Sigma$ (def. 23) with jet coordinates $((x^\mu), (b_{\mu \nu}), (b_{\mu \nu,\rho}), \cdots )$ (def. 54). Consider the functions on the jet bundle given by the linear combinations

(32) $$ \begin{aligned} h_{\mu_1 \mu_2 \mu_3} & \coloneqq \tfrac{1}{2} b_{[\mu_1 \mu_2, \mu_3]} \\ & \coloneqq \tfrac{1}{6} \left( \underset{ \sigma \atop \text{permutation} }{\sum} (-1)^{ {\vert \sigma \vert} } b_{\mu_{\sigma_1} \mu_{\sigma_2}, \mu_{\sigma_3}} \right) \\ & = \tfrac{1}{3} \left( b_{\mu_1 \mu_2, \mu_3} + b_{\mu_2 \mu_3, \mu_1} + b_{\mu_3 \mu_1, \mu_2} \right) \,, \end{aligned} $$

where in the last step we used that $b_{\mu \nu} = - b_{\nu \mu}$, so that the six signed terms of the sum coincide in pairs.

\,

While the jet bundle (def. 54) is not finite dimensional, reflecting the fact that there are arbitrarily high orders of spacetime derivatives of field histories, it turns out that it is only very “mildly infinite dimensional”, in that smooth functions on jet bundles locally depend on only finitely many of the jet coordinates (i.e. only on a finite order of spacetime derivatives). This is the content of the following prop. 19.

This reflects the locality of Lagrangian field theory defined over jet bundles: If functions on the jet bundle could depend on infinitely many jet coordinates, then by Taylor series expansion of fields the function at one point over spacetime could in fact depend on field history values at a different point of spacetime. Such non-local dependence is ruled out by prop. 19 below.

In practice this means that the situation is very convenient:

  1. Any given local Lagrangian density (which will define a field theory, we come to this in def. 60 below) will locally depend on some finite number $k$ of derivatives and may hence locally be treated as living on the ordinary manifold $J^k_\Sigma(E)$.

  2. At the same time, all formulas (such as for the Euler-Lagrange equations, def. 61) work uniformly without worries about fixing a maximal order of derivatives.

Proposition

(jet bundle is a locally pro-manifold)

Given a jet bundle $J^\infty_\Sigma(E)$ as in def. 54, then a smooth function out of it

$$ J^\infty_\Sigma(E) \longrightarrow X $$

is such that around each point of $J^\infty_\Sigma(E)$ there is a neighbourhood $U \subset J^\infty_\Sigma(E)$ on which it is given by a smooth function on $J^k_\Sigma(E)$ for some finite $k$.

(see Khavkine-Schreiber 17, section 2.2 and 3.3)

Example 36 shows that the de Rham differential (def. 12) may be encoded in terms of composing jet prolongation with a suitable function on the jet bundle. More generally, jet prolongation neatly encodes (possibly non-linear) differential operators:

Definition

(differential operator)

Let $E_1 \overset{fb_1}{\to} \Sigma$ and $E_2 \overset{fb_2}{\to} \Sigma$ be two smooth fiber bundles over a common base space $\Sigma$. Then a (possibly non-linear) differential operator from sections of $E_1$ to sections of $E_2$ is a bundle morphism from the jet bundle of $E_1$ (def. 54) to $E_2$:

$$ \array{ J^\infty_\Sigma(E_1) && \overset{\tilde D}{\longrightarrow} && E_2 \\ & \searrow && \swarrow \\ && \Sigma } $$

or rather the function $D$ between the spaces of sections of these bundles which this induces after composition with jet prolongation (def. 55):

$$ D \;\colon\; \Gamma_\Sigma(E_1) \overset{j^\infty_\Sigma}{\longrightarrow} \Gamma_\Sigma(J^\infty_\Sigma(E_1)) \overset{\tilde D \circ (-)}{\longrightarrow} \Gamma_\Sigma(E_2) \,. $$

If both $E_1$ and $E_2$ are vector bundles (def. 7), so that their spaces of sections canonically are vector spaces, then $D$ is called a linear differential operator if it is a linear function between these vector spaces. This means equivalently that $\tilde D$ is a linear function in jet coordinates.

Definition

(normally hyperbolic differential operator on Minkowski spacetime)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34) which is a vector bundle (def. 7) over Minkowski spacetime (def. 23). Write $E^\ast \overset{}{\to} \Sigma$ for its dual vector bundle (def. 8).

A linear differential operator (def. 56)

$$ P \;\colon\; \Gamma_\Sigma(E) \longrightarrow \Gamma_{\Sigma}(E^\ast) $$

is of second order if it has a coordinate expansion of the form

$$ (P \Phi)_a \;=\; P^{\mu \nu}_{a b} \frac{\partial^2 \Phi^b}{\partial x^\mu \partial x^\nu} + P^\mu_{a b} \frac{\partial \Phi^b}{\partial x^\mu} + P_{a b} \Phi^b $$

for $\{(P^{\mu \nu}_{a b}), (P^\mu_{a b}), P_{a b}\}$ smooth functions on $\Sigma$.

This is called a normally hyperbolic differential operator if its principal symbol $(P^{\mu \nu}_{a b})$ is proportional to the inverse Minkowski metric (prop./def. 10) $(\eta^{\mu \nu})$, i.e.

$$ P^{\mu \nu}_{a b} = \eta^{\mu \nu} Q_{a b} \,. $$
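
A standard instance may help fix the definition. The sketch below assumes a single real scalar field (so the indices $a, b$ take one value and are suppressed) and a hypothetical constant mass parameter $m$; the relative sign of the mass term depends on the signature convention for $\eta$, which is not fixed here.

```latex
% Klein-Gordon-type operator on a single scalar field; m is a hypothetical constant
\[
  P \Phi
  \;=\;
  \eta^{\mu\nu} \frac{\partial^2 \Phi}{\partial x^\mu \partial x^\nu} - m^2 \Phi
  \qquad\Longrightarrow\qquad
  P^{\mu\nu} = \eta^{\mu\nu} \,, \quad
  P^\mu = 0 \,, \quad
  P_{a b} \to - m^2 \,, \quad
  Q = 1 \,.
\]
% Non-example: principal symbol given by the Euclidean Kronecker delta instead of \eta
\[
  P^{\mu\nu} = \delta^{\mu\nu}
  \quad \text{is not proportional to } \eta^{\mu\nu}
  \text{, hence not normally hyperbolic.}
\]
```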
Definition

(formally adjoint differential operators)

Let $E \overset{fb}{\to} \Sigma$ be a smooth vector bundle (def. 7) over Minkowski spacetime $\Sigma = \mathbb{R}^{p,1}$ (def. 23) and write $E^\ast \to \Sigma$ for the dual vector bundle (def. 8).

Then a pair of linear differential operators (def. 56) of the form

$$ P, P^\ast \;\colon\; \Gamma_\Sigma(E) \longrightarrow \Gamma_\Sigma(E^\ast) $$

are called formally adjoint differential operators if there is a bilinear differential operator

(33) $$ K \;\colon\; \Gamma_\Sigma(E) \otimes \Gamma_\Sigma(E) \longrightarrow \Gamma_\Sigma(\wedge^{p} T^\ast \Sigma) $$

with values in differential $p$-forms (def. 11) such that for all sections $\Phi_1, \Phi_2 \in \Gamma_\Sigma(E)$ we have

$$ \left( P(\Phi_1) \cdot \Phi_2 - \Phi_1 \cdot P^\ast(\Phi_2) \right) dvol_\Sigma \;=\; d K(\Phi_1, \Phi_2) \,, $$

where $dvol_\Sigma$ is the volume form on Minkowski spacetime (10) and where $d$ denotes the de Rham differential (def. 12).

This implies by Stokes' theorem (prop. 4), in the case of compact support, that under an integral $P$ and $P^\ast$ are related via integration by parts.

(Khavkine 14, def. 2.4)
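
For orientation, here is a worked instance in which an operator is formally adjoint to itself. This is a sketch under assumptions made here: a single real scalar field, the wave operator $\Box \coloneqq \eta^{\mu\nu} \partial_\mu \partial_\nu$, and the notation $\iota_{\partial_\mu}$ for contraction of a form with the coordinate vector field $\partial_\mu$.

```latex
% The wave operator is formally self-adjoint: the difference is a total divergence,
% because the cross terms \eta^{\mu\nu} \partial_\mu \Phi_1 \partial_\nu \Phi_2 cancel by symmetry of \eta
\[
  (\Box \Phi_1) \Phi_2 - \Phi_1 (\Box \Phi_2)
  \;=\;
  \partial_\mu \Big( \eta^{\mu\nu} \big( (\partial_\nu \Phi_1) \Phi_2 - \Phi_1 (\partial_\nu \Phi_2) \big) \Big) \,.
\]
% Hence P = P^\ast = \Box, with witness
\[
  K(\Phi_1, \Phi_2)
  \;=\;
  \eta^{\mu\nu} \big( (\partial_\nu \Phi_1) \Phi_2 - \Phi_1 (\partial_\nu \Phi_2) \big)
  \, \iota_{\partial_\mu} dvol_\Sigma \,,
\]
% using d( j^\mu \iota_{\partial_\mu} dvol_\Sigma ) = (\partial_\mu j^\mu) dvol_\Sigma on Minkowski spacetime.
```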

\,

Remark

(variational calculus – replacing plain bundle morphisms by differential operators)

Various concepts in variational calculus, especially the concept of evolutionary vector fields (def. 64 below) and gauge parameterized implicit infinitesimal gauge symmetries (def. 23 below), follow from concepts in plain differential geometry by systematically replacing plain bundle morphisms by bundle morphisms out of the jet bundle, hence by differential operators $\tilde D$ as in def. 56.

Definition

(variational derivative and total spacetime derivative – the variational bicomplex)

On the jet bundle $J^\infty_\Sigma(E)$ of a trivial super vector bundle over Minkowski spacetime as in def. 54 we may consider its de Rham complex of super differential forms (def. 47); we write its de Rham differential (def. 12) in boldface:

$$ \mathbf{d} \;\colon\; \Omega^\bullet(J^\infty_\Sigma(E)) \longrightarrow \Omega^{\bullet+1}(J^\infty_\Sigma(E)) \,. $$

Since the jet bundle unifies spacetime with field values, we want to decompose this differential into a contribution coming from forming the total derivatives of fields along spacetime (“horizontal derivatives”), and actual variation of fields at a fixed spacetime point (“vertical derivatives”):

The total spacetime derivative or horizontal derivative on $J^\infty_\Sigma(E)$ is the map on differential forms on the jet bundle of the form

$$ d \;\colon\; \Omega^\bullet( J^\infty_\Sigma(E) ) \longrightarrow \Omega^{\bullet+1}( J^\infty_\Sigma(E) ) $$

which on functions $f \colon J^\infty_\Sigma(E) \to \mathbb{R}$ (i.e. on 0-forms) is defined by

(34) $$ \begin{aligned} d f & \coloneqq \frac{d f}{d x^\mu} \mathbf{d} x^\mu \\ & \coloneqq \left( \frac{\partial f}{\partial x^\mu} + \frac{\partial f}{\partial \phi^a} \phi^a_{,\mu} + \frac{ \partial f }{ \partial \phi^a_{,\nu}} \phi^a_{,\nu \mu } + \cdots \right) \mathbf{d} x^\mu \end{aligned} $$

and extended to all forms by the graded Leibniz rule, hence as a nilpotent derivation of degree +1.

The variational derivative or vertical derivative

(35) $$ \delta \;\colon\; \Omega^\bullet( J^\infty_\Sigma(E) ) \longrightarrow \Omega^{\bullet+1}( J^\infty_\Sigma(E) ) $$

is what remains of the full de Rham differential when the total spacetime derivative (horizontal derivative) is subtracted:

(36) $$ \delta \coloneqq \mathbf{d} - d \,. $$

We may then extend the horizontal derivative from functions on the jet bundle to all differential forms on the jet bundle by declaring that

$$ d \circ \mathbf{d} \;\coloneqq\; - \mathbf{d} \circ d $$

which by (36) is equivalent to

(37) $$ d \circ \;\delta\; = - \delta \circ d \,. $$

For example

$$ \begin{aligned} d \delta \phi & = - \delta d \phi \\ & = - \delta \left( \phi_{,\mu} d x^\mu \right) \\ & = - \delta \phi_{,\mu} \wedge d x^\mu \,. \end{aligned} $$
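
To see the splitting $\mathbf{d} = d + \delta$ at work on a function, consider the following small computation. It is a sketch assuming a single scalar field coordinate $\phi$ and the hypothetical function $f = \phi \, \phi_{,\nu}$ for one fixed index $\nu$; it uses only (34) and (36).

```latex
% Horizontal, full, and vertical derivative of f = \phi \phi_{,\nu} (fixed \nu, single scalar field)
\[
\begin{aligned}
  d f
  &= \left( \phi_{,\mu} \phi_{,\nu} + \phi \, \phi_{,\nu\mu} \right) \mathbf{d} x^\mu \,,
  \\
  \mathbf{d} f
  &= \phi_{,\nu} \, \mathbf{d}\phi + \phi \, \mathbf{d}\phi_{,\nu} \,,
  \\
  \delta f = \mathbf{d} f - d f
  &= \phi_{,\nu} \left( \mathbf{d}\phi - \phi_{,\mu} \mathbf{d} x^\mu \right)
   + \phi \left( \mathbf{d}\phi_{,\nu} - \phi_{,\nu\mu} \mathbf{d} x^\mu \right)
  \\
  &= \phi_{,\nu} \, \delta \phi + \phi \, \delta \phi_{,\nu} \,.
\end{aligned}
\]
```

So on functions $\delta$ acts like an ordinary differential in the field coordinates $\phi, \phi_{,\nu}, \cdots$ alone, with the basic vertical 1-forms $\delta \phi = \mathbf{d}\phi - \phi_{,\mu} \mathbf{d}x^\mu$ and $\delta \phi_{,\nu} = \mathbf{d}\phi_{,\nu} - \phi_{,\nu\mu} \mathbf{d}x^\mu$.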

This defines a bigrading on the de Rham complex of $J^\infty_\Sigma(E)$, into horizontal degree $r$ and vertical degree $s$

$$ \Omega^\bullet\left( J^\infty_\Sigma(E) \right) \;\coloneqq\; \underset{r,s}{\oplus} \Omega^{r,s}_\Sigma(E) $$

such that the horizontal and vertical derivative increase horizontal or vertical degree, respectively:

(38) $$ \array{
 C^\infty(J^\infty_\Sigma(E)) = & \Omega^{0,0}_\Sigma(E) &\overset{d}{\longrightarrow}& \Omega^{1,0}_\Sigma(E) &\overset{d}{\longrightarrow}& \Omega^{2,0}_\Sigma(E) &\overset{d}{\longrightarrow}& \cdots &\overset{d}{\longrightarrow}& \Omega^{p+1,0}_\Sigma(E)
 \\
 & \downarrow^{\mathrlap{\delta}} && \downarrow^{\mathrlap{\delta}} && \downarrow^{\mathrlap{\delta}} && \cdots && \downarrow^{\mathrlap{\delta}}
 \\
 & \Omega^{0,1}_\Sigma(E) &\overset{d}{\longrightarrow}& \Omega^{1,1}_\Sigma(E) &\overset{d}{\longrightarrow}& \Omega^{2,1}_\Sigma(E) &\overset{d}{\longrightarrow}& \cdots &\overset{d}{\longrightarrow}& \Omega^{p+1,1}_\Sigma(E)
 \\
 & \downarrow^{\mathrlap{\delta}} && \downarrow^{\mathrlap{\delta}} && \downarrow^{\mathrlap{\delta}} && \cdots && \downarrow^{\mathrlap{\delta}}
 \\
 & \Omega^{0,2}_\Sigma(E) &\overset{d}{\longrightarrow}& \Omega^{1,2}_\Sigma(E) &\overset{d}{\longrightarrow}& \Omega^{2,2}_\Sigma(E) &\overset{d}{\longrightarrow}& \cdots &\overset{d}{\longrightarrow}& \Omega^{p+1,2}_\Sigma(E)
 \\
 & \downarrow^{\mathrlap{\delta}} && \downarrow^{\mathrlap{\delta}} && \downarrow^{\mathrlap{\delta}} && \cdots && \downarrow^{\mathrlap{\delta}}
 \\
 & \vdots && \vdots && \vdots
} \,. $$

This is called the variational bicomplex.

Accordingly we will often refer to the differential forms on the jet bundle as variational differential forms.

\,

derivatives on jet bundle

| def. | symbols | name in physics | name in mathematics |
|---|---|---|---|
| def. 47 | $\mathbf{d}$ | de Rham differential | de Rham differential |
| 59 | $d \coloneqq d x^\mu \frac{d}{d x^\mu}$ | total spacetime derivative | horizontal derivative |
| 59 | $\frac{d}{d x^\mu} \coloneqq \frac{\partial}{\partial x^\mu} + \phi^a_{,\mu} \frac{\partial}{\partial \phi^a} + \cdots$ | total spacetime derivative along $\partial_\mu$ | horizontal derivative along $\partial_\mu$ |
| 59 | $\delta \coloneqq \mathbf{d} - d$ | variational derivative | vertical derivative |
| 22 | $\delta_{EL} \mathbf{L} \coloneqq \mathbf{d}\mathbf{L} + d \Theta_{BFV}$ | Euler-Lagrange variation | Euler-Lagrange operator |
| 85 | $s_{BV}$ | BV-differential | Koszul differential |
| 97 | $s_{BRST}$ | BRST differential | Chevalley-Eilenberg differential |
| 82 | $s$ | BV-BRST differential | Chevalley-Eilenberg-Koszul-Tate differential |
| 122 | $s - d$ | local BV-BRST differential | |

\,

Example

(basic facts about variational calculus)

Given the jet bundle of a field bundle as in def. 54, then in its variational bicomplex (def. 59) we have the following:

  • The total spacetime derivative (horizontal derivative) of a spacetime coordinate function $x^\mu$ coincides with its ordinary de Rham differential

    $$ \begin{aligned} d x^\mu & = \frac{\partial x^\mu}{ \partial x^\nu} \mathbf{d}x^\nu \\ & = \mathbf{d} x^\mu \end{aligned} $$

    which hence is a horizontal 1-form

    $$ \mathbf{d}x^\mu \;\in\; \Omega^{1,0}_\Sigma(E) \,. $$
  • Therefore the variational derivative (vertical derivative) of a spacetime coordinate function vanishes:

    (39) $$ \delta x^\mu = 0 \,, $$

    reflecting the fact that $x^\mu$ is not a field coordinate that could be varied.

  • In particular the given volume form on $\Sigma$ gives a horizontal $(p+1)$-form on the jet bundle, which has the same coordinate expression (and which we denote by the same symbol)

    $$ dvol_\Sigma = d x^0 \wedge d x^1 \wedge \cdots \wedge d x^p \;\in\; \Omega^{p+1,0}_\Sigma(E) \,. $$
  • Generally any horizontal $k$-form is of the form

    $$ f_{\mu_1 \cdots \mu_k} d x^{\mu_1} \wedge \cdots \wedge d x^{\mu_k} \;\in\; \Omega^{k,0}_{\Sigma}(E) $$

    for

    $$ f_{\mu_1 \cdots \mu_k} = f_{\mu_1 \cdots \mu_k}\left((x^\mu), (\phi^a), (\phi^a_{,\mu}), \cdots\right) \in C^\infty(J^\infty_\Sigma(E)) $$

    any smooth function of the spacetime coordinates and the field coordinates (locally depending only on a finite order of these, by prop. 19).

  • In particular every horizontal $(p+1)$-form $\mathbf{L} \in \Omega^{p+1,0}_\Sigma(E)$ is proportional to the above volume form

    $$ \mathbf{L} = L \, dvol_\Sigma $$

    for $L = L((x^\mu), (\phi^a), (\phi^a_{,\mu}), \cdots)$ some smooth function that may depend on all the spacetime and field coordinates.

  • The total spacetime derivative (horizontal derivative) of the variational derivative (vertical derivative) $\delta \phi^a$ of a field variable is the differential 2-form of horizontal degree 1 and vertical degree 1 given by

    $$ \begin{aligned} d (\delta \phi^a) & = - \delta (d \phi^a) \\ & = - (\delta \phi^a_{,\mu}) \wedge \mathbf{d} x^\mu \end{aligned} \,. $$

    In words this says that “the spacetime derivative of the variation of the field is the variation of its spacetime derivative”, up to the sign from the anti-commutation (37).

The following are less trivial properties of variational differential forms:

Proposition

(pullback along jet prolongation compatible with total spacetime derivatives)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle over a spacetime $\Sigma$ (def. 34), with induced jet bundle $J^\infty_\Sigma(E)$ (def. 54).

Then for $\Phi \in \Gamma_\Sigma(E)$ any field history, the pullback of differential forms (def. 2)

$$ j^\infty_\Sigma(\Phi)^\ast \;\colon\; \Omega^\bullet(J^\infty_\Sigma(E)) \longrightarrow \Omega^\bullet(\Sigma) $$

along the jet prolongation of $\Phi$ (def. 55)

  1. intertwines the de Rham differential on spacetime (def. 12) with the total spacetime derivative (horizontal derivative) on the jet bundle (def. 59):

    $$ d \circ j^\infty_\Sigma(\Phi)^\ast \;=\; j^\infty_\Sigma(\Phi)^\ast \circ d \,. $$
  2. annihilates all vertical differential forms (def. 59):

    $$ j^\infty_\Sigma(\Phi)^\ast\vert_{\Omega^{r, \geq 1}_\Sigma(E)} = 0 \,. $$
Proof

The operation of pullback of differential forms along any smooth function intertwines the full de Rham differentials (prop. 2). In particular we have that

dj Σ (Φ) *=j Σ (Φ) *d. d \circ j^\infty_\Sigma(\Phi)^\ast = j^\infty_\Sigma(\Phi)^\ast \circ \mathbf{d} \,.

This means that the second statement immediately follows from the first, by definition of the variational (vertical) derivative as the difference between the full de Rham differential and the horizontal one:

j Σ (Φ) *δ =j Σ (Φ) *(dd) =(dd)j Σ (Φ) * =0 \begin{aligned} j^\infty_\Sigma(\Phi)^\ast \circ \delta & = j^\infty_\Sigma(\Phi)^\ast \circ (\mathbf{d} - d) \\ & = (d - d) \circ j^\infty_\Sigma(\Phi)^\ast \\ & = 0 \end{aligned}

It remains to see the first statement:

Since the jet prolongation $j^\infty_\Sigma(\Phi)$ preserves the spacetime coordinates $x^\mu$ (being a section of the jet bundle) it is immediate that the claimed relation is satisfied on the horizontal basis 1-forms $\mathbf{d}x^\mu = d x^\mu$ (example 38), since both sides vanish:

$$ d \, j^\infty_\Sigma(\Phi)^\ast( \mathbf{d}x^\mu ) = d^2 x^\mu = 0 \qquad \text{and} \qquad j^\infty_\Sigma(\Phi)^\ast d \, \mathbf{d} x^\mu = j^\infty_\Sigma(\Phi)^\ast d^2 x^\mu = 0 \,. $$

Therefore it finally remains only to check the first statement on smooth functions (0-forms). So let

f=f((x μ),(ϕ a),(ϕ ,μ a),) f = f\left( (x^\mu) \,,\, (\phi^a) \,,\, ( \phi^a_{,\mu} ) \,,\, \cdots \right)

be a smooth function on the jet bundle. Then by the chain rule

dj Σ (Φ) *f((x μ),(ϕ a),(ϕ ,μ a),) =df((x μ),(Φ a),(Φ ax μ),) =(fx μ+fϕ aΦ ax μ+fϕ ,ν a 2Φ ax νx μ+)dx μ \begin{aligned} d j^\infty_\Sigma(\Phi)^\ast f\left( (x^\mu) \,,\, (\phi^a) \,,\, ( \phi^a_{,\mu} ) \,,\, \cdots \right) & = d f\left( (x^\mu) \,,\, (\Phi^a) \,,\, \left( \frac{\partial \Phi^a}{\partial x^\mu} \right) \,,\, \cdots \right) \\ & = \left( \frac{\partial f}{\partial x^\mu} + \frac{\partial f}{\partial \phi^a} \frac{\partial \Phi^a}{\partial x^\mu} + \frac{\partial f}{\partial \phi^a_{,\nu}} \frac{\partial^2 \Phi^a}{\partial x^\nu \partial x^\mu} + \cdots \right) d x^\mu \end{aligned}

That this is equal to j Σ (Φ) *dfj^\infty_\Sigma(\Phi)^\ast d f follows by the very definition of the total spacetime derivative of ff (34).
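
As a quick sanity check of both statements, the following worked computation takes the sample function $f = \phi^2$ on the jet bundle of a single scalar field and evaluates both sides for an arbitrary field history $\Phi$. The choice of $f$ is made here purely for illustration. The last line uses $\delta \phi = \mathbf{d}\phi - \phi_{,\mu} d x^\mu$, which is the definition of the vertical derivative as the difference of the full and the horizontal differential.

```latex
\begin{aligned}
  % horizontal derivative on the jet bundle, then pullback
  d f & = 2 \phi \, \phi_{,\mu} \, d x^\mu
  \\
  j^\infty_\Sigma(\Phi)^\ast (d f)
    & = 2 \Phi \, \frac{\partial \Phi}{\partial x^\mu} \, d x^\mu
  \\
  % pullback first, then de Rham differential on spacetime
  d \left( j^\infty_\Sigma(\Phi)^\ast f \right)
    & = d (\Phi^2)
      = 2 \Phi \, \frac{\partial \Phi}{\partial x^\mu} \, d x^\mu
  \\
  % vertical derivative: \delta f = 2 \phi \, \delta \phi
  j^\infty_\Sigma(\Phi)^\ast (\delta f)
    & = 2 \Phi \left( d \Phi - \frac{\partial \Phi}{\partial x^\mu} d x^\mu \right)
      = 0
\end{aligned}
```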

Proposition

(horizontal variational complex of trivial field bundle is exact)

Let EfbΣE \overset{fb}{\to} \Sigma be a field bundle which is a trivial vector bundle over Minkowski spacetime (example 9). Then the chain complex of horizontal differential forms Ω Σ s,0(E)\Omega^{s,0}_\Sigma(E) with the total spacetime derivative (horizontal derivative) dd (def. 59)

(40)  $$ \mathbb{R} \hookrightarrow \Omega^{0,0}_\Sigma(E) \overset{d}{\longrightarrow} \Omega^{1,0}_\Sigma(E) \overset{d}{\longrightarrow} \Omega^{2,0}_\Sigma(E) \overset{d}{\longrightarrow} \cdots \overset{d}{\longrightarrow} \Omega^{p,0}_\Sigma(E) \overset{d}{\longrightarrow} \Omega^{p+1,0}_\Sigma(E) $$

is exact: for all 0sp0 \leq s \leq p the kernel of dd coincides with the image of dd in Ω Σ s,0(E)\Omega^{s,0}_\Sigma(E).

More explicitly, this means that not only is every horizontally exact differential form ω=dα\omega = d \alpha horizontally closed dω=0d \omega = 0 (which follows immediately from the fact that we have a cochain complex in the first place, hence that d 2=0d^2 = 0), but, conversely, if ωΩ Σ 0sp,0(E)\omega \in \Omega^{0 \leq s \leq p,0}_\Sigma(E) satisfies dω=0d \omega = 0, then there exists αΩ Σ s1,0(E)\alpha \in \Omega^{s-1,0}_\Sigma(E) with ω=dα\omega = d \alpha.

(e.g. Anderson 89, prop. 4.3)

We will encounter the extension of the exact sequence (40) by further steps to the right in example 50 below.

\,

This concludes our discussion of variational calculus on the jet bundle of the field bundle. In the next chapter we apply this to Lagrangian densities on the jet bundle, defining Lagrangian field theories.

\,

Lagrangians

Given any type of fields (def. 34), those field histories that are to be regarded as “physically realizable” (if we think of the field theory as a description of the observable universe) should satisfy some differential equation – the equation of motion – meaning that realizability of any field histories may be checked upon restricting the configuration to the infinitesimal neighbourhoods (example 27) of each spacetime point. This expresses the physical absence of “action at a distance” and is one aspect of what it means to have a local field theory. By remark 11 this means that equations of motion of a field theory are equations among the coordinates of the jet bundle of the field bundle.

For many field theories of interest, the differential equation of motion is not an arbitrary partial differential equation, but is of the special kind that exhibits the “principle of extremal action” (prop. 45 below) determined by a local Lagrangian density (def. 60 below). These are called Lagrangian field theories, and this is what we consider here.

Namely, among all the variational differential forms (def. 59) two kinds stand out, namely the 0-forms in $\Omega^{0,0}_\Sigma(E)$ – the smooth functions – and the horizontal $(p+1)$-forms in $\Omega^{p+1,0}_\Sigma(E)$ – to be called the Lagrangian densities $\mathbf{L}$ (def. 60 below) – since these occupy the two “corners” of the variational bicomplex (38). There is not much to say about the 0-forms, but the Lagrangian densities $\mathbf{L}$ do inherit special structure from their special position in the variational bicomplex:

Their variational derivative δL\delta \mathbf{L} uniquely decomposes as

  1. the Euler-Lagrange derivative δ ELL\delta_{EL} \mathbf{L} which is proportional to the variation of the fields (instead of their derivatives)

  2. the total spacetime derivative dΘ BFVd \Theta_{BFV} of a potential Θ BFV\Theta_{BFV} for a presymplectic current Ω BFVδΘ BFV\Omega_{BFV} \coloneqq \delta \Theta_{BFV}.

This is prop. 22 below:

$$ \delta \mathbf{L} \;=\; \underset{ \text{Euler-Lagrange variation} }{\underbrace{\delta_{EL}\mathbf{L}}} - d \underset{\text{presymplectic potential}}{\underbrace{\Theta_{BFV}}} \,. $$

These two terms play a pivotal role in the theory: The condition that the first term vanishes on field histories is a differential equation on field histories, called the Euler-Lagrange equation of motion (def. 61 below). The space of solutions to this differential equation, called the on-shell space of field histories

(41)Γ Σ(E) δ ELL=0AAAΓ Σ(E) \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \overset{\phantom{AAA}}{\hookrightarrow} \Gamma_\Sigma(E)

has the interpretation of the space of “physically realizable field histories”. This is the key object of study in the following chapters. Often this is referred to as the space of classical field histories, indicating that this does not yet reflect the full quantum field theory.

Indeed, there is also the second term in the variational derivative of the Lagrangian density, built from the presymplectic potential $\Theta_{BFV}$. Its variational derivative, the presymplectic current $\Omega_{BFV} = \delta \Theta_{BFV}$, induces a presymplectic structure on the on-shell space of field histories (def. 88 below), which encodes deformations of the algebra of smooth functions on $\Gamma_\Sigma(E)$. This deformation is the quantization of the field theory to an actual quantum field theory, which we discuss below.

$$ \array{ &&& \delta \mathbf{L} \\ &&& = \\ & & \delta_{EL}\mathbf{L} &- & d \Theta_{BFV} & \\ & \swarrow && && \searrow \\ \array{ \text{classical} \\ \text{field theory} } && && && \array{ \text{deformation to} \\ \text{quantum} \\ \text{field theory} } } $$

\,

Definition

(local Lagrangian density)

Given a field bundle EE over a (p+1)(p+1)-dimensional Minkowski spacetime Σ\Sigma as in example 9, then a local Lagrangian density L\mathbf{L} (for the type of field thus defined) is a horizontal differential form of degree (p+1)(p+1) (def. 59) on the corresponding jet bundle (def. 54):

LΩ Σ p+1,0(E). \mathbf{L} \;\in \; \Omega^{p+1,0}_{\Sigma}(E) \,.

By example 38, in terms of the given volume form on spacetime, any such Lagrangian density may uniquely be written as

L=Ldvol Σ \mathbf{L} = L \, dvol_\Sigma

where the coefficient function (the Lagrangian function) is a smooth function on the spacetime and field coordinates:

L=L((x μ),(ϕ a),(ϕ ,μ a),). L = L((x^\mu), (\phi^a), (\phi^a_{,\mu}), \cdots ) \,.

Here, by prop. 19, $L((x^\mu), \cdots)$ depends locally on an arbitrary but finite order of derivatives $\phi^a_{,\mu_1 \cdots \mu_k}$.

We say that a field bundle EfbΣE \overset{fb}{\to} \Sigma (def. 34) equipped with a local Lagrangian density L\mathbf{L} is (or defines) a prequantum Lagrangian field theory on the spacetime Σ\Sigma.

Remark

(parameterized and physical unit-less Lagrangian densities)

More generally we may consider parameterized collections of Lagrangian densities, i.e. functions

L ():UΩ Σ p+1,0(E) \mathbf{L}_{(-)} \;\colon\; U \longrightarrow \Omega^{p+1,0}_\Sigma(E)

for UU some Cartesian space or generally some super Cartesian space.

For example, all Lagrangian densities considered in relativistic field theory are naturally smooth functions of the scale of the metric $\eta$ (def. 10):

$$ \array{ \mathbb{R}_{\gt 0} &\longrightarrow& \Omega^{p+1,0}_\Sigma(E) \\ r &\mapsto& \mathbf{L}_{r^2\eta} } $$

But by the discussion in remark 3, in physics a rescaling of the metric is interpreted as reflecting but a change of physical units of length/distance. Hence if a Lagrangian density is supposed to express intrinsic content of a physical theory, it should remain unchanged under such a change of physical units.

This is achieved by having the Lagrangian be parameterized by further parameters, whose corresponding physical units compensate that of the metric such as to make the Lagrangian density “physical unit-less”.

This means to consider parameter spaces UU equipped with an action of the multiplicative group >0\mathbb{R}_{\gt 0} of positive real numbers, and parameterized Lagrangians

L ():UΩ Σ p+1,0(E) \mathbf{L}_{(-)} \;\colon\; U \longrightarrow \Omega^{p+1,0}_\Sigma(E)

which are invariant under this action.

Remark

(locally variational field theory and Lagrangian p-gerbe connection)

If the field bundle (def. 34) is not just a trivial vector bundle over Minkowski spacetime (example 9) then a Lagrangian density for a given equation of motion may not exist as a globally defined differential (p+1)(p+1)-form, but only as a p-gerbe connection. This is the case for locally variational field theories such as the charged particle, the WZW model and generally theories involving higher WZW terms. For more on this see the exposition at Higher Structures in Physics.

Example

(local Lagrangian density for free real scalar field on Minkowski spacetime)

Consider the field bundle for the real scalar field from example 10, i.e. the trivial line bundle over Minkowski spacetime.

According to def. 54 its jet bundle J Σ (E)J^\infty_\Sigma(E) has canonical coordinates

{{x μ},ϕ,{ϕ ,μ},{ϕ ,μ 1μ 2},}. \left\{ \{x^\mu\}, \phi, \{\phi_{,\mu}\}, \{\phi_{,\mu_1 \mu_2}\}, \cdots \right\} \,.

In these coordinates, the local Lagrangian density $\mathbf{L} \in \Omega^{p+1,0}_\Sigma(E)$ (def. 60) defining the free real scalar field of mass $m \in \mathbb{R}$ on $\Sigma$ is

$$ \mathbf{L} \coloneqq \tfrac{1}{2} \left( \eta^{\mu \nu} \phi_{,\mu} \phi_{,\nu} - m^2 \phi^2 \right) \mathrm{dvol}_\Sigma \,. $$

This is naturally thought of as a collection of Lagrangians smoothly parameterized by the metric $\eta$ and the mass $m$. For this to be physical unit-free in the sense of remark 13, the physical unit of the parameter $m$ must be that of the inverse metric, hence must be an inverse length according to remark 3. This is the inverse of the Compton wavelength $\ell_m = \hbar / m c$ (9), and hence the physical unit-free version of the Lagrangian density for the free scalar particle is

L η, m m 22(η μνϕ ,μϕ ,ν(mc) 2ϕ 2)dvol Σ. \mathbf{L}_{\eta,\ell_m} \:\coloneqq\; \tfrac{\ell_m^2}{2} \left( \eta^{\mu \nu} \phi_{,\mu} \phi_{,\nu} - \left( \tfrac{m c}{\hbar} \right)^2 \phi^2 \right) \mathrm{dvol}_\Sigma \,.
Example

(local Lagrangian density for free electromagnetism)

Consider the field bundle T *ΣΣT^\ast \Sigma \to \Sigma for the electromagnetic field on Minkowski spacetime from example 11, i.e. the cotangent bundle, which over Minkowski spacetime happens to be a trivial vector bundle of rank p+1p+1. With fiber coordinates taken to be (a μ) μ=0 p(a_\mu)_{\mu = 0}^p, the induced fiber coordinates on the corresponding jet bundle J Σ (T *Σ)J^\infty_\Sigma(T^\ast \Sigma) (def. 54) are ((x μ),(a μ),(a μ,ν),(a μ,ν 1ν 2),)( (x^\mu), (a_\mu), (a_{\mu,\nu}), (a_{\mu,\nu_1 \nu_2}), \cdots ).

Consider then the local Lagrangian density (def. 60) given by

(42)L12f μνf μνdvol ΣΩ Σ p+1,0(T *Σ), \mathbf{L} \;\coloneqq\; \tfrac{1}{2} f_{\mu \nu} f^{\mu \nu} dvol_\Sigma \;\in\; \Omega^{p+1,0}_\Sigma(T^\ast \Sigma) \,,

where f μν12(a ν,μa μ,ν)f_{\mu \nu} \coloneqq \tfrac{1}{2}(a_{\nu,\mu} - a_{\mu,\nu}) are the components of the universal Faraday tensor on the jet bundle from example 36.

This is the Lagrangian density that defines the Lagrangian field theory of free electromagnetism.

Here, for $A \in \Gamma_\Sigma(T^\ast \Sigma)$ an electromagnetic field history (vector potential), the pullback of $f_{\mu \nu}$ along its jet prolongation (def. 55) is the corresponding component of the Faraday tensor (20):

(j Σ (A)) *(f μν) =(dA) μν =F μν \begin{aligned} \left( j^\infty_\Sigma(A) \right)^\ast(f_{\mu \nu}) & = (d A)_{\mu \nu} \\ & = F_{\mu \nu} \end{aligned}

It follows that the pullback of the Lagrangian (42) along the jet prolongation of the electromagnetic field is

(j Σ (A)) *L =12F μνF μνdvol Σ =12F ηF \begin{aligned} \left( j^\infty_\Sigma(A) \right)^\ast \mathbf{L} & = \tfrac{1}{2} F_{\mu \nu} F^{\mu \nu} dvol_\Sigma \\ & = \tfrac{1}{2} F \wedge \star_\eta F \end{aligned}

Here η\star_\eta denotes the Hodge star operator of Minkowski spacetime.

More generally:

Example

(Lagrangian density for Yang-Mills theory on Minkowski spacetime)

Let 𝔤\mathfrak{g} be a finite dimensional Lie algebra which is semisimple. This means that the Killing form invariant polynomial

k:𝔤𝔤 k \colon \mathfrak{g} \otimes \mathfrak{g} \longrightarrow \mathbb{R}

is a non-degenerate bilinear form. Examples include the special unitary Lie algebras $\mathfrak{su}(n)$.

Then for $E = T^\ast \Sigma \otimes \mathfrak{g}$ the field bundle for Yang-Mills theory as in example 12, the Lagrangian density (def. 60) of $\mathfrak{g}$-Yang-Mills theory on Minkowski spacetime is

$$ \mathbf{L} \;\coloneqq\; \tfrac{1}{2} k_{\alpha \beta} f^\alpha_{\mu \nu} f^{\beta \mu \nu} dvol_\Sigma \;\in\; \Omega^{p+1,0}_\Sigma(E) \,, $$

where

f μν α=12(a ν,μ αa μ,ν α+γ α βγa μ βa ν γ)Ω Σ 0,0(E) f^\alpha_{\mu \nu} \;=\; \tfrac{1}{2} \left( a^\alpha_{\nu,\mu} - a^\alpha_{\mu,\nu} + \gamma^{\alpha}{}_{\beta \gamma} a^\beta_{\mu} a^\gamma_{\nu} \right) \;\in\; \Omega^{0,0}_\Sigma(E)

is the universal Yang-Mills field strength (31).

Example

(local Lagrangian density for free B-field)

Consider the field bundle Σ 2T *ΣΣ\wedge^2_\Sigma T^\ast \Sigma \to \Sigma for the B-field on Minkowski spacetime from example 14. With fiber coordinates taken to be (b μν)(b_{\mu \nu}) with

b μν=b νμ, b_{\mu \nu} = - b_{\nu \mu} \,,

the induced fiber coordinates on the corresponding jet bundle $J^\infty_\Sigma(\wedge^2_\Sigma T^\ast \Sigma)$ (def. 54) are $( (x^\mu), (b_{\mu \nu}), (b_{\mu \nu, \mu_1}), (b_{\mu \nu, \mu_1 \mu_2}), \cdots )$.

Consider then the local Lagrangian density (def. 60) given by

(43)L12h μ 1μ 2μ 3h μ 1μ 2μ 3dvol ΣΩ Σ p+1,0( Σ 2T *Σ), \mathbf{L} \;\coloneqq\; \tfrac{1}{2} h_{\mu_1 \mu_2 \mu_3} h^{\mu_1 \mu_2 \mu_3} \, dvol_\Sigma \;\in\; \Omega^{p+1,0}_\Sigma(\wedge^2_\Sigma T^\ast \Sigma) \,,

where h μ 1μ 2μ 3h_{\mu_1 \mu_2 \mu_3} are the components of the universal B-field strength on the jet bundle from example 37.

Example

(Lagrangian density for free Dirac field on Minkowski spacetime)

For Σ\Sigma Minkowski spacetime of dimension p+1{3,4,6,10}p + 1 \in \{3,4,6,10\} (def. 23), consider the field bundle Σ×S oddΣ\Sigma \times S_{odd} \to \Sigma for the Dirac field from example 35. With the two-component spinor field fiber coordinates from remark 7, the jet bundle has induced fiber coordinates as follows:

((ψ α),(ψ ,μ α),)=(((χ a),(χ a,μ),),((ξ a˙),(ξ ,μ a˙),)) \left( \left(\psi^\alpha\right) , \left( \psi^\alpha_{,\mu} \right) , \cdots \right) \;=\; \left( \left( (\chi_a), (\chi_{a,\mu}), \cdots \right), \left( ( \xi^{\dagger \dot a}), (\xi^{\dagger \dot a}_{,\mu}), \cdots \right) \right)

All of these are odd-graded elements (def. 45) in a Grassmann algebra (example 29), hence anti-commute with each other, in generalization of (28):

(44)ψ ,μ 1μ r αψ ,μ 1μ s β=ψ ,μ 1μ s βψ ,μ 1μ r α. \psi^\alpha_{,\mu_1 \cdots \mu_r} \psi^\beta_{,\mu_1 \cdots \mu_s} \;=\; - \psi^\beta_{,\mu_1 \cdots \mu_s} \psi^\alpha_{,\mu_1 \cdots \mu_r} \,.

The Lagrangian density (def. 60) of the massless free Dirac field on Minkowski spacetime is

(45)Lψ¯γ μψ ,μdvol Σ, \mathbf{L} \;\coloneqq\; \overline{\psi} \, \gamma^\mu \psi_{,\mu}\, dvol_\Sigma \,,

given by the bilinear pairing ()¯Γ()\overline{(-)}\Gamma(-) from prop. 16 of the field coordinate with its first spacetime derivative and expressed here in two-component spinor field coordinates as in (15), hence with the Dirac conjugate ψ¯\overline{\psi} (14) on the left.

Specifically in spacetime dimension p+1=4p + 1 = 4, the Lagrangian function for the massive Dirac field of mass mm \in \mathbb{R} is

L iψ¯γ μψ ,μkinetic term+mψ¯ψmass term \begin{aligned} L & \coloneqq \underset{ \text{kinetic term} }{ \underbrace{ i \, \overline{\psi} \, \gamma^\mu \, \psi_{,\mu} } } + \underset{ \text{mass term} }{ \underbrace{ m \overline{\psi} \psi }} \end{aligned}

This is naturally thought of as a collection of Lagrangians smoothly parameterized by the metric $\eta$ and the mass $m$. For this to be physical unit-free in the sense of remark 13, the physical unit of the parameter $m$ must be that of the inverse metric, hence must be an inverse length according to remark 3. This is the inverse of the Compton wavelength $\ell_m = \hbar / m c$ (9), and hence the physical unit-free version of the Lagrangian density for the free Dirac field is

L η, m m(iψ¯γ μψ ,μ+(mc)ψ¯ψ)dvol Σ. \mathbf{L}_{\eta,\ell_m} \;\coloneqq\; \ell_m \left( i \overline{\psi} \gamma^\mu \psi_{,\mu} + \left( \tfrac{m c}{\hbar} \right) \overline{\psi} \psi \right) dvol_\Sigma \,.
Remark

(reality of the Lagrangian density of the Dirac field)

The kinetic term of the Lagrangian density for the Dirac field from def. 43 is a sum of two contributions, one for each chiral spinor component in the full Dirac spinor (remark 7):

$$ \begin{aligned} i \, \overline{\psi} \, \gamma^\mu \psi_{,\mu} & = i \underset{ = - (\partial_\mu \xi^a ) \sigma^\mu_{a \dot c} \xi^{\dagger \dot c} + \partial_\mu(\xi^a \sigma^\mu_{a \dot c} \xi^{\dagger \dot c}) }{ \underbrace{ \xi^a \sigma^\mu_{a \dot c} \partial_\mu \xi^{\dagger \dot c} } } + i \chi^\dagger_{\dot a} \tilde \sigma^{\mu \dot a c} \partial_\mu \chi_c \\ & = i \xi^\dagger \tilde \sigma^\mu \partial_\mu \xi + i \chi^\dagger \tilde \sigma^\mu \partial_\mu \chi + i \partial_\mu(\xi \sigma^\mu \xi^\dagger) \end{aligned} $$

Here the computation shown under the brace crucially uses that all these jet coordinates for the Dirac field are anti-commuting, due to their supergeometric nature (44).

Notice that a priori this is a function on the jet bundle with values in $\mathbb{K}$. But in fact for $\mathbb{K} = \mathbb{C}$ it is real up to a total spacetime derivative, because

$$ \begin{aligned} \left( i \chi^\dagger \tilde \sigma^\mu \partial_\mu\chi \right)^\dagger & = -i \left( \partial_\mu \chi\right)^\dagger \tilde \sigma^\mu \chi \\ & = i \chi^\dagger \tilde \sigma^\mu \partial_\mu \chi - i \partial_\mu\left( \chi^\dagger \tilde \sigma^\mu \chi \right) \end{aligned} $$

and similarly for iξ σ˜ μ μξi \xi^\dagger \tilde \sigma^\mu \partial_\mu\xi

(e.g. Dermisek I-9)

\,

The beauty of Lagrangian field theory (def. 60) is that a choice of Lagrangian density determines both the equations of motion of the fields as well as a presymplectic structure on the space of solutions to this equation (the “shell”), making it the “covariant phase space” of the theory. All this we discuss below. But in fact all this key structure of the field theory is nothing but the shadow (under “transgression of variational differential forms”, def. 82 below) of the following simple relation in the variational bicomplex:

Proposition

(Euler-Lagrange form and presymplectic current)

Given a Lagrangian density LΩ Σ p+1,0(E)\mathbf{L} \in \Omega^{p+1,0}_\Sigma(E) as in def. 60, then its de Rham differential dL\mathbf{d}\mathbf{L}, which by degree reasons equals δL\delta \mathbf{L}, has a unique decomposition as a sum of two terms

(46)dL=δ ELLdΘ BFV \mathbf{d} \mathbf{L} = \delta_{EL} \mathbf{L} - d \Theta_{BFV}

such that δ ELL\delta_{EL}\mathbf{L} is proportional to the variational derivative of the fields (but not their derivatives, called a “source form”):

δ ELLΩ Σ p+1,0(E)δC (E)Ω Σ p+1,1(E). \delta_{EL} \mathbf{L} \;\in\; \Omega^{p+1,0}_{\Sigma}(E) \wedge \delta C^\infty(E) \;\subset\; \Omega^{p+1,1}_{\Sigma}(E) \,.

The map

δ EL:Ω Σ p+1,0(E)Ω Σ p+1,0(E)δΩ Σ 0,0(E) \delta_{EL} \;\colon\; \Omega^{p+1,0}_{\Sigma}(E) \longrightarrow \Omega^{p+1,0}_{\Sigma}(E) \wedge \delta \Omega^{0,0}_{\Sigma}(E)

thus defined is called the Euler-Lagrange operator and is explicitly given by the Euler-Lagrange derivative:

(47)  $$ \begin{aligned} \delta_{EL} L \, dvol_\Sigma & \coloneqq \frac{\delta_{EL} L}{\delta \phi^a} \delta \phi^a \wedge dvol_\Sigma \\ & \coloneqq \left( \frac{\partial L}{\partial \phi^a} - \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi^a_{,\mu}} + \frac{d^2}{d x^{\mu_1} d x^{\mu_2}} \frac{\partial L}{\partial \phi^a_{,\mu_1 \mu_2}} - \cdots \right) \delta \phi^a \wedge dvol_\Sigma \,. \end{aligned} $$

The smooth subspace of the jet bundle on which the Euler-Lagrange form vanishes

(48){xJ Σ (E)|δ ELL(x)=0}i J Σ (E). \mathcal{E} \;\coloneqq\; \left\{ x \in J^\infty_\Sigma(E) \;\vert\; \delta_{EL}\mathbf{L}(x) = 0 \right\} \;\overset{i_{\mathcal{E}}}{\hookrightarrow}\; J^\infty_\Sigma(E) \,.

is called the shell. The smaller subspace on which also all total spacetime derivatives vanish (the “formally integrable prolongation”) is the prolonged shell

(49) {xJ Σ (E)|(d kdx μ 1dx μ kδ ELL)(x)=0}i J Σ (E). \mathcal{E}^\infty \;\coloneqq\; \left\{ x \in J^\infty_\Sigma(E) \;\vert\; \left( \frac{d^k}{d x^{\mu_1} \cdots d x^{\mu_k}} \delta_{EL}\mathbf{L} \right)(x) = 0 \right\} \overset{i_{\mathcal{E}^\infty}}{\hookrightarrow} J^\infty_\Sigma(E) \,.

Saying something holds “on-shell” is to mean that it holds after restriction to this subspace. For example a variational differential form αΩ Σ ,(E)\alpha \in \Omega^{\bullet,\bullet}_\Sigma(E) is said to vanish on shell if α| =0\alpha\vert_{\mathcal{E}^\infty} = 0.

The remaining term dΘ BFVd \Theta_{BFV} in (46) is unique, while the presymplectic potential

(50)Θ BFVΩ Σ p,1(E) \Theta_{BFV} \in \Omega^{p,1}_{\Sigma}(E)

is not unique.

(For a field bundle which is a trivial vector bundle (example 9) over Minkowski spacetime (def. 23), prop. 21 says that $\Theta_{BFV}$ is unique up to addition of total spacetime derivatives $d \kappa$, for $\kappa \in \Omega^{p-1,1}_\Sigma(E)$.)

One possible choice for the presymplectic potential $\Theta_{BFV}$ is

(51)  $$ \begin{aligned} \Theta_{BFV} & \coloneqq \phantom{+} \frac{\partial L}{\partial \phi^a_{,\mu}} \delta \phi^a \; \wedge \iota_{\partial_\mu} dvol_\Sigma \\ & \phantom{=} + \left( \frac{\partial L}{\partial \phi^a_{,\nu \mu}} \delta \phi^a_{,\nu} - \frac{d}{d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu \nu}} \delta \phi^a \right) \wedge \iota_{\partial_\mu} dvol_\Sigma \\ & \phantom{=} + \cdots \,, \end{aligned} $$

where

$$ \iota_{\partial_{\mu}} dvol_\Sigma \;\coloneqq\; (-1)^{\mu} d x^0 \wedge \cdots \wedge d x^{\mu-1} \wedge d x^{\mu+1} \wedge \cdots \wedge d x^p $$

denotes the contraction (def. 13) of the volume form with the vector field $\partial_\mu$.

The vertical derivative of a chosen presymplectic potential Θ BFV\Theta_{BFV} is called a pre-symplectic current for L\mathbf{L}:

(52)Ω BFVδΘ BFVΩ Σ p,2(E). \Omega_{BFV} \;\coloneqq\; \delta \Theta_{BFV} \;\;\; \in \Omega^{p,2}_{\Sigma}(E) \,.

Given a choice of Θ BFV\Theta_{BFV} then the sum

(53)L+Θ BFVΩ Σ p+1,0(E)Ω Σ p,1(E) \mathbf{L} + \Theta_{BFV} \;\in\; \Omega^{p+1,0}_\Sigma(E) \oplus \Omega^{p,1}_\Sigma(E)

is called the corresponding Lepage form. Its de Rham derivative is the sum of the Euler-Lagrange variation and the presymplectic current:

(54)d(L+Θ BFV)=δ ELL+Ω BFV. \mathbf{d}( \mathbf{L} + \Theta_{BFV} ) \;=\; \delta_{EL} \mathbf{L} + \Omega_{BFV} \,.

(Its conceptual nature will be elucidated after the introduction of the local BV-complex in example 75 below.)
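
The identity (54) follows in three lines from the decomposition (46). The sketch below assumes only that the full de Rham differential splits as $\mathbf{d} = d + \delta$ and that $\Omega_{BFV} = \delta \Theta_{BFV}$ as in (52):

```latex
\begin{aligned}
  \mathbf{d}( \mathbf{L} + \Theta_{BFV} )
    & = \mathbf{d}\mathbf{L} + (d + \delta) \Theta_{BFV}
    \\
    % insert the decomposition (46) for \mathbf{d}\mathbf{L}
    & = \left( \delta_{EL}\mathbf{L} - d \Theta_{BFV} \right)
        + d \Theta_{BFV} + \delta \Theta_{BFV}
    \\
    % the two d\Theta_{BFV} terms cancel
    & = \delta_{EL}\mathbf{L} + \Omega_{BFV}
\end{aligned}
```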

Proof

Using L=Ldvol Σ\mathbf{L} = L dvol_\Sigma and that dL=0d \mathbf{L} = 0 by degree reasons (example 38), we find

dL =(Lϕ aδϕ a+Lϕ ,μ aδϕ ,μ a+Lϕ ,μ 1μ 2 aδϕ ,μ 1μ 2 a+)dvol Σ. \begin{aligned} \mathbf{d}\mathbf{L} & = \left( \frac{\partial L}{\partial \phi^a} \delta \phi^a + \frac{\partial L}{\partial \phi^a_{,\mu}} \delta \phi^a_{,\mu} + \frac{\partial L}{\partial \phi^a_{,\mu_1 \mu_2}} \delta \phi^a_{,\mu_1 \mu_2} + \cdots \right) \wedge dvol_{\Sigma} \end{aligned} \,.

The idea now is to have dΘ BFVd \Theta_{BFV} pick up those terms that would appear as boundary terms under the integral Σj Σ (Φ) *dL\int_\Sigma j^\infty_\Sigma(\Phi)^\ast \mathbf{d}L if we were to consider integration by parts to remove spacetime derivatives of δϕ a\delta \phi^a.

We compute, using example 38, the total horizontal derivative of $\Theta_{BFV}$ from (51) as follows:

$$ \begin{aligned} d \Theta_{BFV} & = \left( d \left( \frac{\partial L}{\partial \phi^a_{,\mu}} \delta \phi^a \right) + d \left( \frac{\partial L}{\partial \phi^a_{,\nu \mu}} \delta \phi^a_{,\nu} - \frac{d}{d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu \nu}} \delta \phi^a \right) + \cdots \right) \wedge \iota_{\partial_\mu} dvol_\Sigma \\ & = \left( \left( \left( d \frac{\partial L}{\partial \phi^a_{,\mu}} \right) \wedge \delta \phi^a - \frac{\partial L}{\partial \phi^a_{,\mu}} \delta d \phi^a \right) + \left( \left(d \frac{\partial L}{\partial \phi^a_{,\nu \mu}}\right) \wedge \delta \phi^a_{,\nu} - \frac{\partial L}{\partial \phi^a_{,\nu \mu}} \delta d \phi^a_{,\nu} - \left( d \frac{d}{d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu \nu}} \right) \wedge \delta \phi^a + \frac{d}{d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu \nu}} \delta d \phi^a \right) + \cdots \right) \wedge \iota_{\partial_\mu} dvol_\Sigma \\ & = - \left( \left( \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi^a_{,\mu}} \delta \phi^a + \frac{\partial L}{\partial \phi^a_{,\mu}} \delta \phi^a_{,\mu} \right) + \left( \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi^a_{,\nu \mu}} \delta \phi^a_{,\nu} + \frac{\partial L}{\partial \phi^a_{,\nu \mu}} \delta \phi^a_{,\nu \mu} - \frac{d^2}{ d x^\mu d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu \nu}} \delta \phi^a - \frac{d}{d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu \nu}} \delta \phi^a_{,\mu} \right) + \cdots \right) \wedge dvol_\Sigma \,, \end{aligned} $$

where in the last line we used that

dx μ 1ι μ 2dvol Σ={dvol Σ | ifμ 1=μ 2 0 | otherwise d x^{\mu_1} \wedge \iota_{\partial_{\mu_2}} dvol_\Sigma = \left\{ \array{ dvol_\Sigma &\vert& \text{if}\, \mu_1 = \mu_2 \\ 0 &\vert& \text{otherwise} } \right.

Here the two terms proportional to ddx νLϕ ,μν aδϕ ,μ a\frac{d}{d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu \nu}} \delta \phi^a_{,\mu} cancel out, and we are left with

dΘ BFV=(ddx μLϕ ,μ ad 2dx μdx νLϕ ,μν a+)δϕ advol Σ(Lϕ ,μ aδϕ ,μ a+Lϕ ,νμ aδϕ ,νμ a+)dvol Σ d \Theta_{BFV} \;=\; - \left( \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi^a_{,\mu}} - \frac{d^2}{ d x^\mu d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu \nu}} + \cdots \right) \delta \phi^a \wedge dvol_\Sigma - \left( \frac{\partial L}{\partial \phi^a_{,\mu}} \delta \phi^a_{,\mu} + \frac{\partial L}{\partial \phi^a_{,\nu \mu}} \delta \phi^a_{,\nu \mu} + \cdots \right) \wedge dvol_\Sigma

Hence $-d \Theta_{BFV}$ shares with $\mathbf{d} \mathbf{L}$ the terms that are proportional to $\delta \phi^a_{,\mu_1 \cdots \mu_k}$ for $k \geq 1$, and so the remaining terms are proportional to $\delta \phi^a$, as claimed:

$$ \mathbf{d}\mathbf{L} + d \Theta_{BFV} = \underset{ = \delta_{EL}\mathbf{L} }{ \underbrace{ \left( \frac{\partial L}{\partial \phi^a} - \frac{d}{d x^\mu}\frac{\partial L}{\partial \phi^a_{,\mu}} + \frac{d^2}{d x^\mu d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu\nu}} - \cdots \right) \delta \phi^a \wedge dvol_\Sigma }} \,. $$
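
To see the decomposition (46) in the simplest case, one may apply (47) and (51) to the free real scalar field, with Lagrangian function $L = \tfrac{1}{2}\left( \eta^{\mu \nu} \phi_{,\mu} \phi_{,\nu} - m^2 \phi^2 \right)$ as in the example above. The following is a sketch that keeps the signs exactly as that Lagrangian is written. Since $L$ depends on jets of at most first order, all higher-order terms drop out.

```latex
\begin{aligned}
  % the two partial derivatives that enter (47) and (51)
  \frac{\partial L}{\partial \phi} & = - m^2 \phi
  \,, \qquad
  \frac{\partial L}{\partial \phi_{,\mu}} = \eta^{\mu \nu} \phi_{,\nu}
  \\
  % Euler-Lagrange form (47), truncated at first order
  \delta_{EL} \mathbf{L}
    & = \left( - m^2 \phi - \eta^{\mu \nu} \phi_{,\mu \nu} \right)
        \delta \phi \wedge dvol_\Sigma
  \\
  % presymplectic potential (51), first line only
  \Theta_{BFV}
    & = \eta^{\mu \nu} \phi_{,\nu} \, \delta \phi \wedge \iota_{\partial_\mu} dvol_\Sigma
\end{aligned}
```

Requiring the coefficient of $\delta \phi$ in the second line to vanish on a field history gives a second-order wave equation with a mass term.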

The following fact is immediate from prop. 22, but of central importance; we further amplify it in remark 16 below:

Proposition

(total spacetime derivative of presymplectic current vanishes on-shell)

Let (E,L)(E,\mathbf{L}) be a Lagrangian field theory (def. 60). Then the Euler-Lagrange form δ ELL\delta_{EL} \mathbf{L} and the presymplectic current (prop. 22) are related by

dΩ BFV=δ(δ ELL). d \Omega_{BFV} = - \delta(\delta_{EL}\mathbf{L}) \,.

In particular this means that restricted to the prolonged shell J Σ (E)\mathcal{E}^\infty \hookrightarrow J^\infty_\Sigma(E) (49) the total spacetime derivative of the presymplectic current vanishes:

(55)dΩ BFV| =0. d \Omega_{BFV} \vert_{\mathcal{E}^\infty} \;=\; 0 \,.
Proof

By prop. 22 we have

δL=δ ELLdΘ BFV. \delta \mathbf{L} = \delta_{EL} \mathbf{L} - d \Theta_{BFV} \,.

The claim follows from applying the variational derivative δ\delta to both sides, using (37): δ 2=0\delta^2 = 0 and δd=dδ\delta \circ d = - d \circ \delta.
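
Spelled out, and assuming nothing beyond the two identities from (37) and the definition $\Omega_{BFV} = \delta \Theta_{BFV}$ from (52), the computation reads:

```latex
\begin{aligned}
  0 = \delta^2 \mathbf{L}
    & = \delta \left( \delta_{EL}\mathbf{L} - d \Theta_{BFV} \right)
    \\
    % use \delta \circ d = - d \circ \delta
    & = \delta ( \delta_{EL}\mathbf{L} ) + d \, \delta \Theta_{BFV}
    \\
    & = \delta ( \delta_{EL}\mathbf{L} ) + d \, \Omega_{BFV}
\end{aligned}
```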

Many examples of interest fall into the following two special cases of prop. 22:

Example

(Euler-Lagrange form for spacetime-independent Lagrangian densities)

Let (E,L)(E,\mathbf{L}) be a Lagrangian field theory (def. 60) whose field bundle EE is a trivial vector bundle EΣ×FE \simeq \Sigma \times F over Minkowski spacetime Σ\Sigma (example 9).

In general the Lagrangian density L\mathbf{L} is a function of all the spacetime and field coordinates

L=L((x μ),(ϕ a),(ϕ ,μ a),)dvol Σ. \mathbf{L} = L((x^\mu), (\phi^a), (\phi^a_{,\mu}), \cdots) dvol_\Sigma \,.

Consider the special case that $\mathbf{L}$ is spacetime-independent in that the Lagrangian function $L$ is independent of the spacetime coordinates $(x^\mu)$. Then the same evidently holds for the Euler-Lagrange form $\delta_{EL}\mathbf{L}$ (prop. 22). Therefore in this case the shell (49) is itself a trivial bundle over spacetime.

In this situation every point φ\varphi in the jet fiber defines a constant section of the shell:

(56)Σ×{φ} . \Sigma \times \{\varphi\} \subset \mathcal{E}^\infty \,.
Example

(canonical momentum)

Consider a Lagrangian field theory (E,L)(E, \mathbf{L}) (def. 60) whose Lagrangian density L\mathbf{L}

  1. does not depend on the spacetime-coordinates (example 24);

  2. depends on spacetime derivatives of field coordinates (hence on jet bundle coordinates) at most to first order.

Hence if the field bundle EfbΣE \overset{fb}{\to} \Sigma is a trivial vector bundle over Minkowski spacetime (example 9) this means to consider the case that

L=L((ϕ a),(ϕ ,μ a))dvol Σ. \mathbf{L} \;=\; L\left( (\phi^a), (\phi^a_{,\mu}) \right) \wedge dvol_\Sigma \,.

Then the presymplectic current (def. 22) is (up to possibly a horizontally exact part) of the form

(57)Ω BFV=δp a μδϕ aι μdvol Σ \Omega_{BFV} \;=\; \delta p_a^\mu \wedge \delta \phi^a \wedge \iota_{\partial_\mu} dvol_\Sigma

where

(58)p a μLϕ ,μ a p_a^\mu \;\coloneqq\; \frac{\partial L}{ \partial \phi^a_{,\mu}}

denotes the partial derivative of the Lagrangian function with respect to the spacetime-derivatives of the field coordinates.

Here

p a p a 0 =Lϕ ,0 a \begin{aligned} p_a & \coloneqq p_a^0 \\ & = \frac{\partial L}{\partial \phi^a_{,0}} \end{aligned}

is called the canonical momentum corresponding to the “canonical field coordinate” $\phi^a$.

In the language of multisymplectic geometry the full expression

p a μι μdvol ΣΩ Σ p,1(E) p_a^\mu \wedge \iota_{\partial_\mu} dvol_\Sigma \;\in\; \Omega^{p,1}_\Sigma(E)

is also called the “canonical multi-momentum”, or similar.

Proof

We compute:

$$
\begin{aligned}
  \mathbf{d} \mathbf{L}
  & =
  \left(
    \frac{\partial L}{\partial \phi^a} \delta \phi^a
    +
    \frac{\partial L}{\partial \phi^a_{,\mu}} \delta \phi^a_{,\mu}
  \right) \wedge dvol_\Sigma
  \\
  & =
  \left(
    \frac{\partial L}{\partial \phi^a}
    -
    \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi^a_{,\mu}}
  \right) \delta \phi^a \wedge dvol_\Sigma
  -
  d
  \underset{ \Theta_{BFV} }{
    \underbrace{
      \left( \frac{\partial L}{\partial \phi^a_{,\mu}} \delta \phi^a \right) \wedge \iota_{\partial_\mu} dvol_\Sigma
    }
  }
\end{aligned}
\,.
$$

Hence

$$
\begin{aligned}
  \Omega_{BFV}
  & \coloneqq \delta \Theta_{BFV}
  \\
  & = \delta \left( \frac{\partial L}{\partial \phi^a_{,\mu}} \delta \phi^a \wedge \iota_{\partial_\mu} dvol_\Sigma \right)
  \\
  & = \delta \frac{\partial L}{\partial \phi^a_{,\mu}} \wedge \delta \phi^a \wedge \iota_{\partial_\mu} dvol_\Sigma
  \\
  & = \delta p_a^\mu \wedge \delta \phi^a \wedge \iota_{\partial_\mu} dvol_\Sigma
\end{aligned}
$$
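
As a quick illustration of (57) and (58), consider a single real scalar field with a Lagrangian function of the form below. The potential $V$ is a hypothetical smooth function introduced only for this sketch; for $V(\phi) = \tfrac{1}{2} m^2 \phi^2$ this is the Lagrangian function of the free real scalar field.

```latex
\begin{aligned}
  L & = \tfrac{1}{2} \eta^{\mu \nu} \phi_{,\mu} \phi_{,\nu} - V(\phi)
  \\
  p^\mu & = \frac{\partial L}{\partial \phi_{,\mu}} = \eta^{\mu \nu} \phi_{,\nu}
  \\
  \Omega_{BFV}
  & = \delta p^\mu \wedge \delta \phi \wedge \iota_{\partial_\mu} dvol_\Sigma
    = \eta^{\mu \nu} \delta \phi_{,\nu} \wedge \delta \phi \wedge \iota_{\partial_\mu} dvol_\Sigma
\end{aligned}
```

The potential drops out because it does not depend on the jet coordinates $\phi_{,\mu}$, so under this assumption the presymplectic current is the same as for the free field.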
Remark

(presymplectic current is local version of (pre-)symplectic form of Hamiltonian mechanics)

In the simple but very common situation of example 44 the presymplectic current (def. 22) takes the form (57)

$$ \Omega_{BFV} \;=\; \delta p_a^\mu \wedge \delta \phi^a \wedge \iota_{\partial_\mu} dvol_\Sigma $$

with $\phi^a$ the field coordinates (“canonical coordinates”) and $p_a^\mu$ the “canonical momentum” (58).

Notice that this is of the schematic form “$(\delta p_a \wedge \delta q^a) \wedge dvol_{\Sigma_p}$”, which is reminiscent of the wedge product of a symplectic form expressed in Darboux coordinates with a volume form for a $p$-dimensional manifold. Indeed, below in Phase space we discuss that this presymplectic current “transgresses” (def. 82 below) to a presymplectic form of the schematic form “$d P_a \wedge d Q^a$” on the on-shell space of field histories (def. 61) by integrating it over a Cauchy surface of dimension $p$. In good situations this presymplectic form is in fact a symplectic form on the on-shell space of field histories (theorem 2 below).

This shows that the presymplectic current Ω BFV\Omega_{BFV} is the local (i.e. jet level) avatar of the symplectic form that governs the formulation of Hamiltonian mechanics in terms of symplectic geometry.

In fact prop. 23 may be read as saying that the presymplectic current is a conserved current (def. 66 below), only that it takes values not in smooth functions of the field coordinates and jets, but in variational 2-forms on fields. There is a conserved charge associated with every conserved current (prop. 48 below) and the conserved charge associated with the presymplectic current is the (pre-)symplectic form on the phase space of the field theory (def. 88 below).

Example

(Euler-Lagrange form and presymplectic current for free real scalar field)

Consider the Lagrangian field theory of the free real scalar field from example 39.

Then the Euler-Lagrange form and presymplectic current (prop. 22) are

(59) $$ \delta_{EL}\mathbf{L} \;=\; \left(\eta^{\mu \nu} \phi_{,\mu \nu} - m^2 \phi \right) \delta \phi \wedge dvol_\Sigma \;\in\; \Omega^{p+1,1}_{\Sigma}(E) \,. $$

and

Ω BFV=(η μνδϕ ,μδϕ)ι νdvol ΣΩ Σ p,2(E), \Omega_{BFV} \;=\; \left(\eta^{\mu \nu} \delta \phi_{,\mu} \wedge \delta \phi \right) \wedge \iota_{\partial_\nu} dvol_{\Sigma} \;\in\; \Omega^{p,2}_{\Sigma}(E) \,,

respectively.

Proof

This is a special case of example 44, but we spell it out in detail again:

We need to show that the Euler-Lagrange operator $\delta_{EL} \colon \Omega^{p+1,0}_\Sigma(E) \to \Omega^{p+1,1}_\Sigma(E)$ takes the local Lagrangian density for the free scalar field to

$$ \delta_{EL} \mathbf{L} \;=\; \left( \eta^{\mu \nu} \phi_{,\mu \nu} - m^2 \phi \right) \delta \phi \wedge \mathrm{dvol}_\Sigma \,. $$

First of all, using just that the variational derivative (vertical derivative) $\delta$ is a graded derivation, the result of applying it to the local Lagrangian density is

δL=(η μνϕ ,μδϕ ,νm 2ϕδϕ)dvol Σ. \delta L \;=\; \left( \eta^{\mu \nu} \phi_{,\mu} \delta \phi_{,\nu} - m^2 \phi \delta \phi \right) \wedge \mathrm{dvol}_\Sigma \,.

By definition of the Euler-Lagrange operator, in order to find $\delta_{EL}\mathbf{L}$ and $\Theta_{BFV}$, we need to exhibit this as a sum of the form $(-) \wedge \delta \phi - d \Theta_{BFV}$.

The key to find Θ BFV\Theta_{BFV} is to realize δϕ ,νdvol Σ\delta \phi_{,\nu}\wedge \mathrm{dvol}_\Sigma as a total spacetime derivative (horizontal derivative). Since dϕ=ϕ ,μdx μd \phi = \phi_{,\mu} d x^\mu this is accomplished by

δϕ ,νdvol Σ=δdϕι νdvol Σ, \delta \phi_{,\nu} \wedge \mathrm{dvol}_\Sigma = \delta d \phi \wedge \iota_{\partial_\nu} \mathrm{dvol}_\Sigma \,,

where on the right we have the contraction (def. 13) of the tangent vector field along x νx^\nu into the volume form.

Hence we may take the presymplectic potential (50) of the free scalar field to be

(60)Θ BFVη μνϕ ,μδϕι νdvol Σ, \Theta_{BFV} \coloneqq \eta^{\mu \nu} \phi_{,\mu} \delta \phi \wedge \iota_{\partial_\nu} \mathrm{dvol}_\Sigma \,,

because with this we have

$$ d \Theta_{BFV} = \eta^{\mu \nu} \left( \phi_{,\mu \nu} \delta \phi - \phi_{,\mu} \delta \phi_{,\nu} \right) \wedge \mathrm{dvol}_\Sigma \,. $$

In conclusion this yields the decomposition of the vertical differential of the Lagrangian density

$$ \delta L = \underset{ = \delta_{EL} \mathbf{L} }{ \underbrace{ \left( \eta^{\mu \nu} \phi_{,\mu \nu} - m^2 \phi \right) \delta \phi \wedge \mathrm{dvol}_\Sigma } } - d \Theta_{BFV} \,, $$

which shows that δ ELL\delta_{EL} L is as claimed, and that Θ BFV\Theta_{BFV} is a presymplectic potential current (50). Hence the presymplectic current itself is

Ω BFV δΘ BFV =δ(η μνϕ ,μδϕι νdvol Σ) =(η μνδϕ ,μδϕ)ι νdvol Σ. \begin{aligned} \Omega_{BFV} &\coloneqq \delta \Theta_{BFV} \\ & = \delta \left( \eta^{\mu \nu} \phi_{,\mu} \delta \phi \wedge \iota_{\partial_\nu} \mathrm{dvol}_\Sigma \right) \\ & = \left(\eta^{\mu \nu} \delta \phi_{,\mu} \wedge \delta \phi \right) \wedge \iota_{\partial_\nu} dvol_{\Sigma} \end{aligned} \,.
Example

(Euler-Lagrange form for free electromagnetic field)

Consider the Lagrangian field theory of free electromagnetism from example 40.

The Euler-Lagrange variational derivative is

δ ELL=f ,μ μνδa ν. \delta_{EL} \mathbf{L} \;=\; f^{\mu \nu}_{,\mu} \delta a_\nu \,.

Hence the shell (48) in this case is

=Σ×{((a μ),(a μ,μ 1),(a μ,μ 1μ 2),)|f μν ,μ=0}J Σ (T *Σ). \mathcal{E} = \Sigma \times \left\{ \left( (a_\mu) , (a_{\mu,\mu_1}), (a_{\mu,\mu_1 \mu_2}), \cdots \right) \;\vert\; f^{\mu \nu}{}_{,\mu} = 0 \right\} \;\subset\; J^\infty_\Sigma(T^\ast \Sigma) \,.
Proof

By (47) we have

δ ELLδa μδa μ =(a μ12a [μ,ν]a [μ,ν]=0ddx ρa α,ρ12a [μ,ν]a [μ,ν])δa α =12(ddx ρa α,ρa μ,νa [μ,ν])δa α =(ddx ρa [α,ρ])δa α =f μν ,μδa ν. \begin{aligned} \frac{\delta_{EL} L}{\delta a_\mu} \delta a_\mu & = \left( \underset{ = 0 }{ \underbrace{ \frac{\partial}{\partial a_\mu} \tfrac{1}{2} a_{[\mu,\nu]} a^{[\mu,\nu]} } } - \frac{d}{d x^\rho} \frac{\partial}{\partial a_{\alpha,\rho}} \tfrac{1}{2} a_{[\mu,\nu]} a^{[\mu,\nu]} \right) \delta a_\alpha \\ & = - \tfrac{1}{2} \left( \frac{d}{d x^\rho} \frac{\partial}{\partial a_{\alpha,\rho}} a_{\mu,\nu} a^{[\mu,\nu]} \right) \delta a_\alpha \\ & = - \left( \frac{d}{d x^\rho} a^{[\alpha,\rho]} \right) \delta a_{\alpha} \\ & = - f^{\mu \nu}{}_{,\mu} \delta a_{\nu} \,. \end{aligned}

More generally:

Example

(Euler-Lagrange form for Yang-Mills theory on Minkowski spacetime)

Let 𝔤\mathfrak{g} be a semisimple Lie algebra and consider the Lagrangian field theory (E,L)(E,\mathbf{L}) of 𝔤\mathfrak{g}-Yang-Mills theory from example 41.

Its Euler-Lagrange form (prop. 22) is

δ ELL =(f ,μ μνα+γ α βγa μ βf μνγ)k αβδa μ βdvol Σ, \begin{aligned} \delta_{EL}\mathbf{L} & = \left( f^{\mu \nu \alpha}_{,\mu} + \gamma^\alpha{}_{\beta \gamma} a_\mu^\beta f^{\mu \nu \gamma} \right) k_{\alpha \beta} \,\delta a_\mu^\beta \, dvol_\Sigma \,, \end{aligned}

where

f μν αΩ Σ 0,0(E) f^\alpha_{\mu \nu} \;\in\; \Omega^{0,0}_\Sigma(E)

is the universal Yang-Mills field strength (31).

Proof

With the explicit form (47) for the Euler-Lagrange derivative we compute as follows:

δ EL(12k αβf μν αf βμν) =((a μ α(a ν,μ α+12γ α α 2α 3a μ α 2a ν α 3))k αβf βμν(ddx νa μ,ν α(a ν,μ α+12γ α α 2α 3a μ α 2a ν α 3))k αβf βμν)δa μ α =γ α αα 3a ν α 3f βμνk αβδa μ α(ddx μf βμν)k αβδa ν α =(f ,μ αμν+γ α βγa μ βf γμν)k αβδa ν β \begin{aligned} \delta_{EL} \left( \tfrac{1}{2} k_{\alpha \beta} f^\alpha_{\mu\nu} f^{\beta \mu \nu} \right) & = \left( \left( \frac{\partial}{\partial a_{\mu'}^{\alpha'}} \left( a_{\nu,\mu}^\alpha + \tfrac{1}{2} \gamma^{\alpha}{}_{\alpha_2 \alpha_3} a_{\mu}^{\alpha_2} a_\nu^{\alpha_3} \right) \right) k_{\alpha \beta} f^{\beta \mu \nu} - \left( \frac{d}{d x^{\nu'}} \frac{\partial}{\partial a_{\mu',\nu'}^{\alpha'}} \left( a_{\nu,\mu}^\alpha + \tfrac{1}{2} \gamma^{\alpha}{}_{\alpha_2 \alpha_3} a_{\mu}^{\alpha_2} a_\nu^{\alpha_3} \right) \right) k_{\alpha \beta} f^{\beta \mu \nu} \right) \delta a_{\mu'}^{\alpha'} \\ & = \gamma^{\alpha}{}_{\alpha' \alpha_3} a_\nu^{\alpha_3} f^{\beta \mu \nu} k_{\alpha \beta} \delta a_{\mu}^{\alpha'} - \left( \frac{d}{d x^{\mu}} f^{\beta \mu \nu} \right) k_{\alpha \beta} \delta a_{\nu}^{\alpha} \\ &= - \left( f^{\alpha \mu \nu}_{,\mu} + \gamma^\alpha{}_{\beta \gamma} a_\mu^\beta f^{\gamma \mu \nu} \right) k_{\alpha \beta} \delta a_\nu^\beta \end{aligned}

In the last step we used that for a semisimple Lie algebra $\gamma_{\alpha \beta \gamma} \coloneqq k_{\alpha \alpha'} \gamma^{\alpha'}{}_{\beta \gamma}$ is totally skew-symmetric in its indices (these being the coefficients of the Lie algebra cocycle which is in transgression with the Killing form invariant polynomial $k$).

Example

(Euler-Lagrange form of free B-field)

Consider the Lagrangian field theory of the free B-field from example 14.

The Euler-Lagrange variational derivative is

δ ELL=h μνρ ,ρδb μν, \delta_{EL} \mathbf{L} \;=\; h^{\mu \nu \rho}{}_{,\rho} \delta b_{\mu \nu} \,,

where h μ 1μ 2μ 3h_{\mu_1 \mu_2 \mu_3} is the universal B-field strength from example 37.

Proof

By (47) we have

δ ELLδb μνδb μν =(b μν12b [μ 1μ 2,μ 3]b [μ 1μ 2,μ 3]=0ddx ρb μν,ρ12b [μ 1μ 2,μ 3]b [μ 1μ 2,μ 3])δb μν =(ddx ρb μν,ρ12b μ 1μ 2,μ 3b [μ 1μ 2,μ 3])δb μν =(ddx ρb [μν,ρ])δb μν =h μνρ ,ρδb μν. \begin{aligned} \frac{\delta_{EL} L}{\delta b_{\mu \nu}} \delta b_{\mu \nu} & = \left( \underset{ = 0 }{ \underbrace{ \frac{\partial}{\partial b_{\mu \nu}} \tfrac{1}{2} b_{[\mu_1 \mu_2, \mu_3]} b^{[\mu_1 \mu_2, \mu_3]} } } - \frac{d}{d x^\rho} \frac{\partial}{\partial b_{\mu \nu, \rho}} \tfrac{1}{2} b_{[\mu_1 \mu_2, \mu_3]} b^{[\mu_1 \mu_2, \mu_3]} \right) \delta b_{\mu \nu} \\ & = - \left( \frac{d}{d x^\rho} \frac{\partial}{\partial b_{\mu \nu, \rho}} \tfrac{1}{2} b_{\mu_1 \mu_2, \mu_3} b^{[\mu_1 \mu_2, \mu_3]} \right) \delta b_{\mu \nu} \\ & = - \left( \frac{d}{d x^\rho} b^{[\mu \nu, \rho]} \right) \delta b_{\mu \nu} \\ & = - h^{\mu \nu \rho}{}_{,\rho} \delta b_{\mu \nu} \,. \end{aligned}
Example

(Euler-Lagrange form and presymplectic current of Dirac field)

Consider the Lagrangian field theory of the Dirac field on Minkowski spacetime of dimension p+1{3,4,6,10}p + 1 \in \{3,4,6,10\} (example 43).

Then

  • the Euler-Lagrange variational derivative (def. 22) in the case of vanishing mass mm is

    δ ELL=2iδψ¯γ μψ ,μdvol Σ \delta_{EL} \mathbf{L} \;=\; 2 i\, \overline{\delta \psi} \,\gamma^\mu\, \psi_{,\mu} \, \wedge dvol_\Sigma

    and in the case that spacetime dimension is p+1=4p +1 = 4 and arbitrary mass mm\in \mathbb{R}, it is

    δ ELL=(δψ¯(iγ μψ ,μ+mψ)+(iγ μψ ,μ¯+mψ¯)(δψ))dvol Σ \delta_{EL} \mathbf{L} \;=\; \left( \overline{\delta \psi} \left( i \gamma^\mu \psi_{,\mu} + m \psi \right) + \left( - i \gamma^\mu\overline{\psi_{,\mu}} + m \overline{\psi} \right) (\delta \psi) \right) \, dvol_\Sigma
  • its presymplectic current (def. 22) is

    Ω BFV=δψ¯γ μδψι μdvol Σ \Omega_{BFV} \;=\; \overline{\delta \psi}\,\gamma^\mu \,\delta \psi \, \iota_{\partial_\mu} dvol_\Sigma
Proof

In any case the canonical momentum of the Dirac field according to example 44 is

p μ α ψ ,μ α(iψ¯γ νψ ,ν+mψ¯ψ) =ψ¯ β(γ μ) β α \begin{aligned} p^\alpha_\mu & \coloneqq \frac{\partial }{\partial \psi^\alpha_{,\mu}} \left( i \overline {\psi} \, \gamma^\nu \, \psi_{,\nu} + m \overline{\psi} \psi \right) \\ & = \overline{\psi}^\beta (\gamma^\mu)_\beta{}^\alpha \end{aligned}

This yields the presymplectic current as claimed, by example 44.

Now regarding the Euler-Lagrange form, first consider the massless case in spacetime dimension p+1{3,4,6,10}p+1 \in \{3,4,6,10\}, where

L=iψ¯γ μψ ,μ. L \;=\; i \overline{\psi} \, \gamma^\mu \, \psi_{,\mu} \,.

Then we compute as follows:

δ ELL =iδψ¯γ μψ ,μiψ ,μ¯γ μδψ=+iδψ¯γ μψ ,μ =2iδψ¯γ μψ ,μ \begin{aligned} \delta_{EL} L & = i \,\overline{\delta \psi} \, \gamma^\mu \, \psi_{,\mu} \underset{ = + i \,\overline{\delta \psi} \, \gamma^\mu \, \psi_{,\mu} }{ \underbrace{ - i \overline{\psi_{,\mu}} \, \gamma^\mu \, \delta \psi } } \\ & = 2 i \, \overline{\delta \psi} \, \gamma^\mu \, \psi_{,\mu} \end{aligned}

Here the first equation is the general formula (47) for the Euler-Lagrange variation, while the identity under the braces combines two facts (as in remark 17 above):

  1. the symmetry (12) of the spinor pairing ()¯γ μ()\overline{(-)}\gamma^\mu(-) (prop. 16);

  2. the anti-commutativity (44) of the Dirac field and jet coordinates, due to their supergeometric nature (remark 10).

Finally in the special case of the massive Dirac field in spacetime dimension p+1=4p+1 = 4 the Lagrangian function is

L=iψ¯γ μψ ,μ+mψ¯ψ L \;=\; i \, \overline{\psi} \gamma^\mu \psi_{,\mu} + m \overline{\psi}\psi

where now $\psi_\alpha$ takes values in the complex numbers $\mathbb{C}$ (as opposed to in $\mathbb{R}$, $\mathbb{H}$ or $\mathbb{O}$). Therefore we may now form the derivative equivalently by treating $\psi$ and $\overline{\psi}$ as independent components of the field. This immediately yields the claim.

Example

(trivial Lagrangian densities and the Euler-Lagrange complex)

If a Lagrangian density L\mathbf{L} (def. 39) is in the image of the total spacetime derivative, hence horizontally exact (def. 59)

L=d \mathbf{L} \;=\; d \mathbf{\ell}

for some $\mathbf{\ell} \in \Omega^{p,0}_\Sigma(E)$, then both its Euler-Lagrange form as well as its presymplectic current (def. 22) vanish:

δ ELL=0AA,AAΩ BFV=0. \delta_{EL}\mathbf{L} = 0 \phantom{AA} \,, \phantom{AA} \Omega_{BFV} = 0 \,.

This is because with δd=dδ\delta \circ d = - d \circ \delta (37) the defining unique decomposition (46) of δL\delta \mathbf{L} is given by

$$
\begin{aligned}
  \delta \mathbf{L}
  & = \delta d \mathbf{\ell}
  \\
  & = \underset{= \delta_{EL}\mathbf{L}}{\underbrace{0}} - d \underset{\Theta_{BFV}}{\underbrace{\delta \mathbf{\ell}}}
\end{aligned}
$$

which then implies with (52) that

Ω BFV δΘ BFV =δδ =0 \begin{aligned} \Omega_{BFV} & \coloneqq \delta \Theta_{BFV} \\ & = \delta \delta \mathbf{\ell} \\ & = 0 \end{aligned}

Therefore the Lagrangian densities which are total spacetime derivatives are also called trivial Lagrangian densities.
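
A minimal worked instance, for a single real scalar field on Minkowski spacetime: the Lagrangian function $L = \phi \, \phi_{,0}$ is chosen here purely for illustration (it is not one of the examples of this text). It is the total derivative of $\tfrac{1}{2}\phi^2$ along $x^0$, and its Euler-Lagrange derivative (47) vanishes identically:

```latex
\begin{aligned}
  \mathbf{L}
  & = \phi \, \phi_{,0} \, dvol_\Sigma
    = d \left( \tfrac{1}{2} \phi^2 \, \iota_{\partial_0} dvol_\Sigma \right)
  \\
  \frac{\partial L}{\partial \phi}
  - \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi_{,\mu}}
  & = \phi_{,0} - \frac{d}{d x^0} \phi
    = \phi_{,0} - \phi_{,0}
    = 0
\end{aligned}
```

Adding such a term to any Lagrangian density therefore leaves the equations of motion unchanged.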

If the field bundle EfbΣE \overset{fb}{\to} \Sigma is a trivial vector bundle (example 9) over Minkowski spacetime (def. 23) then also the converse is true: Every Lagrangian density whose Euler-Lagrange form vanishes is a total spacetime derivative.

Stated more abstractly, this means that the exact sequence of the total spacetime derivative from prop. 21 extends to the right via the Euler-Lagrange variational derivative $\delta_{EL}$ to an exact sequence of the form

Ω Σ 0,0(E)dΩ Σ 1,0(E)dΩ Σ 2,0(E)ddΩ Σ p,0(E)dΩ Σ p+1,0(E)δ ELΩ Σ p+1,0(E)δ(C (E))δ H. \mathbb{R} \overset{}{\hookrightarrow} \Omega^{0,0}_\Sigma(E) \overset{d}{\longrightarrow} \Omega^{1,0}_\Sigma(E) \overset{d}{\longrightarrow} \Omega^{2,0}_\Sigma(E) \overset{d}{\longrightarrow} \cdots \overset{d}{\longrightarrow} \Omega^{p,0}_\Sigma(E) \overset{d}{\longrightarrow} \Omega^{p+1,0}_\Sigma(E) \overset{\delta_{EL}}{\longrightarrow} \Omega^{p+1,0}_\Sigma(E) \wedge \delta(C^\infty(E)) \overset{\delta_{H}}{\longrightarrow} \cdots \,.

In fact, as shown, this exact sequence keeps going to the right; this is also called the Euler-Lagrange complex.

(Anderson 89, theorem 5.1)

The next differential δ H\delta_{H} after the Euler-Lagrange variational derivative δ EL\delta_{EL} is known as the Helmholtz operator. By definition of exact sequence, the Helmholtz operator detects whether a partial differential equation on field histories, induced by a variational differential form PΩ Σ p+1,0(E)δ(C (E))P \in \Omega^{p+1,0}_\Sigma(E) \wedge \delta(C^\infty(E)) as in (61) comes from varying a Lagrangian density, hence whether it is the equation of motion of a Lagrangian field theory via def. 61.

This way homological algebra is brought to bear on core questions of field theory. For more on this see the exposition at Higher Structures in Physics.

Remark

(supergeometric nature of Lagrangian density of the Dirac field)

Observe that the Lagrangian density for the Dirac field (def. 43) makes sense (only) due to the supergeometric nature of the Dirac field (remark 10): If the field jet coordinates $\psi_{,\mu_1 \cdots \mu_k}$ were not anti-commuting (44) then the Dirac field’s Lagrangian density (def. 43) would be a total spacetime derivative and hence be trivial according to example 50.

This is because

d(12ψ¯γ μψι μdvol Σ)=12ψ ,μ¯γ μψdvol Σ+12ψ¯γ μψ ,μdvol Σ=(1)12ψ ,μ¯γ μψdvol Σ. d \left( \tfrac{1}{2} \overline{\psi} \,\gamma^\mu\, \psi \, \iota_{\partial_\mu} dvol_\Sigma \right) = \tfrac{1}{2} \overline{\psi_{,\mu}} \,\gamma^\mu\, \psi \, dvol_\Sigma + \underset{ = (-1) \tfrac{1}{2} \overline{\psi_{,\mu}} \,\gamma^\mu\, \psi \, dvol_\Sigma }{ \underbrace{ \tfrac{1}{2}\overline{\psi} \,\gamma^\mu\, \psi_{,\mu} \, dvol_\Sigma }} \,.

Here the identification under the brace uses two facts:

  1. the symmetry (12) of the spinor bilinear pairing ()¯Γ()\overline{(-)}\Gamma (-);

  2. the anti-commutativity (44) of the Dirac field and jet coordinates, due to their supergeometric nature (remark 10).

The second fact gives the minus sign under the brace, which makes the total expression vanish, if the Dirac field and jet coordinates indeed are anti-commuting (which, incidentally, means that we found an “off-shell conserved current” for the Dirac field, see example 55 below).

If however the Dirac field and jet coordinates did commute with each other, we would instead have a plus sign under the brace, in which case the total horizontal derivative expression above would equal the massless Dirac field Lagrangian (45), thus rendering it trivial in the sense of example 50.

The same supergeometric nature of the Dirac field will be necessary for its intended equation of motion, the Dirac equation (example 52) to derive from a Lagrangian density; see the proof of example 49 below, and see remark 27 below.

\,

The key implication of the Euler-Lagrange form on the jet bundle is that it induces the equation of motion on the space of field histories:

Definition

(Euler-Lagrange equation of motion)

Given a Lagrangian field theory $(E,\mathbf{L})$ (def. 60), the corresponding Euler-Lagrange equation of motion is the condition on field histories (def. 33)

Φ ():UΓ Σ(E) \Phi_{(-)} \;\colon\; U \longrightarrow \Gamma_\Sigma(E)

to have a jet prolongation (def. 55)

j Σ (Φ ()()):U×ΣJ Σ (E) j^\infty_\Sigma(\Phi_{(-)}(-) ) \;\colon\; U \times \Sigma \longrightarrow J^\infty_\Sigma(E)

that factors through the shell inclusion i J Σ (E)\mathcal{E} \overset{i_{\mathcal{E}}}{\hookrightarrow} J^\infty_\Sigma(E) (48) defined by vanishing of the Euler-Lagrange form (prop. 22)

(61)j Σ (Φ ()()):U×Σi J Σ (E). j^\infty_\Sigma(\Phi_{(-)}(-)) \;\colon\; U \times \Sigma \longrightarrow \mathcal{E} \overset{i_{\mathcal{E}}}{\hookrightarrow} J^\infty_\Sigma(E) \,.

(This implies that j Σ (Φ ())j^\infty_\Sigma(\Phi_{(-)}) factors even through the prolonged shell i J Σ (E)\mathcal{E}^\infty \overset{i_{\mathcal{E}^\infty}}{\hookrightarrow} J^\infty_\Sigma(E) (49).)

In the case that the field bundle is a trivial vector bundle over Minkowski spacetime as in example 9 this is the condition that Φ ()\Phi_{(-)} satisfies the following differential equation (again using prop. 22):

$$ \left( \frac{\partial L}{\partial \phi^a} - \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi^a_{,\mu}} + \frac{d^2}{d x^\mu d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu\nu}} - \cdots \right) \left( (x^\mu), (\Phi^a_{(-)}), \left( \frac{\partial \Phi^a_{(-)}}{\partial x^\mu}\right), \left( \frac{\partial^2 \Phi^a_{(-)}}{\partial x^\mu \partial x^\nu} \right), \cdots \right) \;=\; 0 \,. $$

The on-shell space of field histories is the space of solutions to this condition, namely the sub-super smooth set (def. 48) of the full space of field histories (22) (def. 33)

(62)Γ Σ(E) δ ELL=0AAAΓ Σ(E) \Gamma_\Sigma(E)_{\delta_{EL} L = 0} \overset{\phantom{AAA}}{\hookrightarrow} \Gamma_\Sigma(E)

whose plots are those Φ ():UΓ Σ(E)\Phi_{(-)} \colon U \to \Gamma_\Sigma(E) that factor through the shell (61).

More generally for Σ rΣ\Sigma_r \hookrightarrow \Sigma a submanifold of spacetime, we write

(63)Γ Σ r(E) δ ELL=0AAAΓ Σ r(E) \Gamma_{\Sigma_r}(E)_{\delta_{EL} L = 0} \overset{\phantom{AAA}}{\hookrightarrow} \Gamma_{\Sigma_r}(E)

for the sub-super smooth set of on-shell field histories restricted to the infinitesimal neighbourhood of $\Sigma_r$ in $\Sigma$ (25).

Definition

(free field theory)

A Lagrangian field theory $(E, \mathbf{L})$ (def. 60) with field bundle $E \overset{fb}{\to} \Sigma$ a vector bundle (e.g. a trivial vector bundle as in example 9) is called a free field theory if its Euler-Lagrange equation of motion (def. 61) is a linear differential equation, in that with

Φ 1,Φ 2Γ Σ(E) δ ELL=0 \Phi_1, \Phi_2 \;\in\; \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0}

any two on-shell field histories (62) and c 1,c 2c_1, c_2 \in \mathbb{R} any two real numbers, also the linear combination

c 1Φ 1+c 2Φ 2Γ Σ(E), c_1 \Phi_1 + c_2 \Phi_2 \;\in\; \Gamma_\Sigma(E) \,,

which a priori exists only as an element in the off-shell space of field histories, is again a solution to the equations of motion and hence an element of Γ Σ(E) δ ELL=0\Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0}.

A Lagrangian field theory which is not a free field theory is called an interacting field theory.
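
To illustrate the definition, compare the Klein-Gordon equation $(\Box - m^2)\Phi = 0$ of the free real scalar field (discussed below) with a cubic deformation of it, $(\Box - m^2)\Phi = g \Phi^3$. The coupling constant $g$ and the specific form of the nonlinear term are assumptions made only for this sketch. In both lines $\Phi_1$ and $\Phi_2$ are taken to be solutions of the respective equation.

```latex
\begin{aligned}
  \text{(free)} \quad
  (\Box - m^2)(c_1 \Phi_1 + c_2 \Phi_2)
  & = c_1 (\Box - m^2)\Phi_1 + c_2 (\Box - m^2) \Phi_2
    = 0
  \\
  \text{(interacting)} \quad
  (\Box - m^2)(\Phi_1 + \Phi_2) - g (\Phi_1 + \Phi_2)^3
  & = g \left( \Phi_1^3 + \Phi_2^3 \right) - g (\Phi_1 + \Phi_2)^3
  \\
  & = - 3 g \, \Phi_1 \Phi_2 (\Phi_1 + \Phi_2)
\end{aligned}
```

In the first case linear combinations of solutions are again solutions. In the second case the sum fails to be a solution unless the right-hand side happens to vanish, so such a theory would be interacting in the sense of the definition.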

Remark

(relevance of free field theory)

In perturbative quantum field theory one considers interacting field theories in the infinitesimal neighbourhood (example 27) of free field theories (def. 62) inside some super smooth set of general Lagrangian field theories. While free field theories are typically of limited interest in themselves, this perturbation theory around them exhausts much of what is known about quantum field theory in general, and therefore free field theories are of paramount importance for the general theory.

We discuss the covariant phase space of free field theories below in Propagators, and their quantization below in Free quantum fields.

Example

(equation of motion of free real scalar field is Klein-Gordon equation)

Consider the Lagrangian field theory of the free real scalar field from example 39.

By example 45 its Euler-Lagrange form is

$$ \delta_{EL}\mathbf{L} \;=\; \left(\eta^{\mu \nu} \phi_{,\mu \nu} - m^2 \phi \right) \delta \phi \wedge dvol_\Sigma \,. $$

Hence for $\Phi \in \Gamma_\Sigma(E) = C^\infty(\Sigma)$ a field history, its Euler-Lagrange equation of motion according to def. 61 is

η μν 2x μx νΦm 2Φ=0 \eta^{\mu \nu} \frac{\partial^2 }{\partial x^\mu \partial x^\nu} \Phi - m^2 \Phi \;=\; 0

often abbreviated as

(64)(m 2)Φ=0. (\Box - m^2) \Phi \;=\; 0 \,.

This PDE is called the Klein-Gordon equation on Minkowski spacetime. If the mass $m$ vanishes, $m = 0$, then this is the relativistic wave equation.

Hence this is indeed a free field theory according to def. 62.

The corresponding linear differential operator (def. 56)

(65)(m 2):Γ Σ(Σ×)Γ Σ(Σ×) (\Box - m^2) \;\colon\; \Gamma_\Sigma(\Sigma \times \mathbb{R}) \longrightarrow \Gamma_\Sigma(\Sigma \times \mathbb{R})

is called the Klein-Gordon operator.
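
As a sanity check, one may test the Klein-Gordon equation (64) on a plane wave. The constant covector $k = (k_\mu)$ below is a hypothetical parameter introduced for this sketch, and the final line assumes the signature convention $\eta = \mathrm{diag}(-1, 1, \cdots, 1)$.

```latex
\begin{aligned}
  \Phi(x) & = \cos(k_\mu x^\mu)
  \\
  \frac{\partial^2 \Phi}{\partial x^\mu \partial x^\nu} & = - k_\mu k_\nu \, \Phi
  \\
  (\Box - m^2) \Phi & = - \left( \eta^{\mu \nu} k_\mu k_\nu + m^2 \right) \Phi
  \\
  (\Box - m^2) \Phi = 0
  \;\;\Leftrightarrow\;\;
  (k_0)^2 & = \textstyle{\sum_{i = 1}^{p}} (k_i)^2 + m^2
\end{aligned}
```

Under the stated assumption, the plane wave is an on-shell field history precisely if its wave covector satisfies this dispersion relation.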

For later use we record the following basic fact about the Klein-Gordon equation:

Example

(Klein-Gordon operator is formally self-adjoint )

The Klein-Gordon operator (65) is its own formal adjoint (def. 58) witnessed by the bilinear differential operator (33) given by

(66)K(Φ 1,Φ 2)(Φ 1x μΦ 2Φ 1Φ 2x μ)η μνι νdvol Σ. K(\Phi_1, \Phi_2) \;\coloneqq\; \left( \frac{\partial \Phi_1}{\partial x^\mu} \Phi_2 - \Phi_1 \frac{\partial \Phi_2}{\partial x^\mu} \right) \eta^{\mu \nu}\iota_{\partial_\nu} dvol_\Sigma \,.
Proof
$$
\begin{aligned}
  d K(\Phi_1, \Phi_2)
  & =
  d \left(
    \frac{\partial \Phi_1}{\partial x^\mu} \Phi_2
    -
    \Phi_1 \frac{\partial \Phi_2}{\partial x^\mu}
  \right) \eta^{\mu \nu}\iota_{\partial_\nu} dvol_\Sigma
  \\
  & =
  \left(
    \left(
      \eta^{\mu \nu}\frac{\partial^2 \Phi_1}{\partial x^\mu \partial x^\nu} \Phi_2
      +
      \eta^{\mu \nu} \frac{\partial \Phi_1}{\partial x^\mu} \frac{\partial \Phi_2}{\partial x^\nu}
    \right)
    -
    \left(
      \eta^{\mu \nu} \frac{\partial \Phi_1}{\partial x^\nu} \frac{\partial \Phi_2}{\partial x^\mu}
      +
      \Phi_1 \eta^{\mu \nu} \frac{\partial^2 \Phi_2}{\partial x^\nu \partial x^\mu}
    \right)
  \right) dvol_\Sigma
  \\
  & =
  \left(
    \eta^{\mu \nu}\frac{\partial^2 \Phi_1}{\partial x^\mu \partial x^\nu} \Phi_2
    -
    \Phi_1 \eta^{\mu \nu} \frac{\partial^2 \Phi_2}{\partial x^\nu \partial x^\mu}
  \right) dvol_\Sigma
  \\
  & =
  \left( \Box(\Phi_1) \Phi_2 - \Phi_1 \Box (\Phi_2) \right) dvol_\Sigma
  \,.
\end{aligned}
$$
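
The computation above involves only the wave operator $\Box$. One additional step, spelled out here as a sketch, shows why the same $K$ also witnesses the statement for the full Klein-Gordon operator: the mass terms are symmetric in $\Phi_1$ and $\Phi_2$ and cancel.

```latex
\begin{aligned}
  \left( (\Box - m^2) \Phi_1 \right) \Phi_2 - \Phi_1 \left( (\Box - m^2) \Phi_2 \right)
  & = (\Box \Phi_1) \Phi_2 - \Phi_1 (\Box \Phi_2)
      - m^2 \Phi_1 \Phi_2 + m^2 \Phi_1 \Phi_2
  \\
  & = (\Box \Phi_1) \Phi_2 - \Phi_1 (\Box \Phi_2)
\end{aligned}
```

Wedged with $dvol_\Sigma$, this is the expression for $d K(\Phi_1, \Phi_2)$ computed in the proof.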
Example

(equations of motion of vacuum electromagnetism are vacuum Maxwell's equations)

Consider the Lagrangian field theory of free electromagnetism on Minkowski spacetime from example 40.

By example 46 its Euler-Lagrange form is

δ ELL=ddx μf μνδa ν. \delta_{EL}\mathbf{L} \;=\; \frac{d}{d x^\mu}f^{\mu \nu} \delta a_\nu \,.

Hence for AΓ Σ(T *Σ)=Ω 1(Σ)A \in \Gamma_{\Sigma}(T^\ast \Sigma) = \Omega^1(\Sigma) a field history (“vector potential”), its Euler-Lagrange equation of motion according to def. 61 is

x μF μν=0 d ηF=0, \begin{aligned} & \frac{\partial}{\partial x^\mu} F^{\mu \nu} = 0 \\ \Leftrightarrow\;\; & d \star_\eta F = 0 \end{aligned} \,,

where F=dAF = d A is the Faraday tensor (20). (In the coordinate-free formulation in the second line “ η\star_\eta” denotes the Hodge star operator induced by the pseudo-Riemannian metric η\eta on Minkowski spacetime.)

These PDEs are called the vacuum Maxwell's equations.

This, too, is a free field theory according to def. 62.

Example

(equation of motion of Dirac field is Dirac equation)

Consider the Lagrangian field theory of the Dirac field on Minkowski spacetime from example 43, with field fiber the spin representation SS regarded as a superpoint S oddS_{odd} and Lagrangian density given by the spinor bilinear pairing

L=iψ¯γ μ μψ+mψ¯ψ L \;=\; i \overline{\psi} \gamma^\mu \partial_\mu \psi + m \overline{\psi}\psi

(in spacetime dimension p+1{3,4,6,10}p+1 \in \{3,4,6,10\} with m=0m = 0 unless p+1=4p+1 = 4).

From example 49 it follows that the corresponding Euler-Lagrange equation of motion (def. 61) is

(67)(iγ μ μ+m)ψ=0. \left(-i \gamma^\mu \partial_\mu + m\right)\psi \;=\; 0 \,.

This is the Dirac equation. In terms of the Feynman slash notation from (16) the corresponding differential operator, the Dirac operator, reads

(i/+m)ψ=0. \left( - i \partial\!\!\!/\, + m \right) \psi = 0 \,.

Hence this is a free field theory according to def. 62.

Observe that the “square” of the Dirac operator is the Klein-Gordon operator m 2\Box - m^2 (64)

(+iγ μ μ+m)(iγ μ μ+m)ψ =( μ μm 2)ψ =(m 2)ψ. \begin{aligned} \left( +i \gamma^\mu \partial_\mu + m \right) \left(-i \gamma^\mu \partial_\mu + m\right)\psi & = \left(\partial_\mu \partial^\mu - m^2\right) \psi \\ & = \left(\Box - m^2\right) \psi \end{aligned} \,.

This means that a Dirac field which solves the Dirac equation is in particular (on Minkowski spacetime) componentwise a solution to the Klein-Gordon equation.

Remark

(supergeometric nature of the Dirac equation as an Euler-Lagrange equation)

While the Dirac equation (67) of example 52 would make sense in itself also if the field coordinates ψ\psi and jet coordinates ψ ,μ\psi_{,\mu} of the Dirac field were not anti-commuting (44), due to their supergeometric nature (remark 10), it would, by remark 17, then no longer be the Euler-Lagrange equation of a Lagrangian density, hence then Dirac field theory would not be a Lagrangian field theory.

Example

(Dirac operator on Dirac spinors is formally anti-self-adjoint differential operator)

The Dirac operator, hence the differential operator corresponding to the Dirac equation of example 52 via def. 56, is formally anti-self-adjoint (def. 58):

D *=D. D^\ast = - D \,.
Proof

Regard the Dirac operator as taking values in the dual spin bundle by using the Dirac conjugate ()¯\overline{(-)} (14):

Γ Σ(Σ×S) D Γ Σ(Σ×S *) Ψ ()¯γ μ μΨ \array{ \Gamma_\Sigma(\Sigma \times S) &\overset{D}{\longrightarrow}& \Gamma_\Sigma(\Sigma \times S^\ast) \\ \Psi &\mapsto& \overline{(-)} \gamma^\mu \partial_\mu \Psi }

Then we need to show that there is K(,)K(-,-) such that for all pairs of spinor sections Ψ 1,Ψ 2\Psi_1, \Psi_2 we have

$$ \overline{\Psi_2}\gamma^\mu (\partial_\mu \Psi_1) - \overline{\Psi_1}\gamma^\mu (-\partial_\mu \Psi_2) \;=\; d K(\Psi_1, \Psi_2) \,. $$

But the spinor-to-vector pairing is symmetric (12), hence this is equivalent to

$$ \overline{\partial_\mu \Psi_1}\gamma^\mu \Psi_2 + \overline{\Psi_1}\gamma^\mu (\partial_\mu \Psi_2) \;=\; d K(\Psi_1, \Psi_2) \,. $$

By the product law of differentiation, this is solved, for all Ψ 1,Ψ 2\Psi_1, \Psi_2, by

K(Ψ 1,Ψ 2)(Ψ 1¯γ μΨ 2)ι μdvol. K(\Psi_1, \Psi_2) \;\coloneqq\; \left( \overline{\Psi_1} \gamma^\mu \Psi_2\right) \, \iota_{\partial_\mu} dvol \,.

\,

This concludes our discussion of Lagrangian densities and their variational calculus. In the next chapter we consider the infinitesimal symmetries of Lagrangians.

\,

Symmetries

We have introduced the concept of Lagrangian field theories $(E,\mathbf{L})$ in terms of a field bundle $E$ equipped with a Lagrangian density $\mathbf{L}$ on its jet bundle (def. 60). Generally, given any object equipped with some structure, it is of paramount interest to determine the symmetries, hence the isomorphisms/equivalences of the object that preserve the given structure (this is the “Erlanger program”, Klein 1872).

The infinitesimal symmetries of the Lagrangian density (def. 66 below) send one field history to an infinitesimally nearby one which is “equivalent” for all purposes of field theory. Among these are the infinitesimal gauge symmetries which will be of concern below. A central theorem of variational calculus says that infinitesimal symmetries of the Lagrangian correspond to conserved currents; this is Noether's theorem I, prop. 30 below. These conserved currents constitute an extension of the Lie algebra of symmetries, called the Dickey bracket.

But in (54) we have seen that the Lagrangian density of a Lagrangian field theory is just one component, in codimension 0, of an inhomogeneous “Lepage form” which in codimension 1 is given by the presymplectic potential current $\Theta_{BFV}$ (50). (This will be conceptually elucidated, after we have introduced the local BV-complex, in example 75 below.) This means that in codimension 1 we are to consider infinitesimal on-shell symmetries of the Lepage form $\mathbf{L} + \Theta_{BFV}$. These are known as Hamiltonian vector fields (def. 70 below) and the analog of Noether's theorem I now says that these correspond to Hamiltonian differential forms. The Lie algebra of these infinitesimal symmetries is called the local Poisson bracket (prop. 36 below).

Noether theorem and Hamiltonian Noether theorem

| variational form | symmetry | homotopy formula | physical quantity | local symmetry algebra |
|---|---|---|---|---|
| Lagrangian density $\mathbf{L}$ (def. 60) | $\mathcal{L}_v \mathbf{L} = d \tilde J$ | $d(\underset{= J_v}{\underbrace{\tilde J - \iota_v \Theta_{BFV}}}) = \iota_v \, \delta_{EL}\mathbf{L}$ | conserved current $J_v$ (def. 66) | Dickey bracket |
| presymplectic current $\Omega_{BFV}$ (prop. 22) | $\mathcal{L}^{var}_v \Theta_{BFV} = \delta \tilde H$ | $\delta(\underset{= H_v}{\underbrace{\tilde H_v - \iota_v \Theta_{BFV}}}) = \iota_v \Omega_{BFV}$ | Hamiltonian form $H_v$ (def. 70) | local Poisson bracket (prop. 36) |

\,

In Phase space below we transgress this local Poisson bracket of infinitesimal symmetries of the presymplectic potential current to the “global” Poisson bracket on the covariant phase space (def. 90 below). This is the structure which then further below leads over to the quantization (deformation quantization) of the prequantum field theory to a genuine perturbative quantum field theory. However, it will turn out that there may be an obstruction to this construction, namely the existence of special infinitesimal symmetries of the Lagrangian densities, called implicit gauge symmetries (discussed further below).

\,

\,

infinitesimal symmetries of the Lagrangian density

Definition

(variation)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34).

A variation is a vertical vector field $v$ on the jet bundle $J^\infty_\Sigma(E)$ (def. 54), hence a vector field which vanishes when evaluated in the horizontal differential forms.

In the special case that the field bundle is a trivial vector bundle over Minkowski spacetime as in example 9, a variation is of the form

$$ v = v^a \partial_{\phi^a} + v^a_{,\mu} \partial_{\phi^a_{,\mu}} + v^a_{,\mu_1 \mu_2} \partial_{\phi^a_{,\mu_1 \mu_2}} + \cdots $$

The concept of variation in def. 63 is very general, in that it allows one to vary the field coordinates independently from the corresponding jets. This generality is necessary for the discussion of symmetries of presymplectic currents in def. 70 below. But for the discussion of symmetries of Lagrangian densities we are interested in explicitly varying just the field coordinates (def. 64 below) and inducing from this the corresponding variations of the field derivatives (prop. 28 below).

In order to motivate the following definition 64 of evolutionary vector fields we follow remark 12, which says that concepts in variational calculus are obtained from their analogous concepts in plain differential calculus by replacing plain bundle morphisms by morphisms out of the jet bundle:

Given a fiber bundle $E \overset{fb}{\to} \Sigma$, a vertical vector field on $E$ is a section of its vertical tangent bundle $T_\Sigma E$ (def. 6), hence is a bundle morphism of this form

$$ \array{ E && \overset{\text{vertical vector field}}{\longrightarrow} && T_\Sigma E \\ & {}_{\mathllap{id}}\searrow && \swarrow \\ && E } $$

The variational version replaces the bundle $E$ on the left with its jet bundle:

Definition

(evolutionary vector fields)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34). Then an evolutionary vector field $v$ on $E$ is a “variational vertical vector field” on $E$, hence a smooth bundle homomorphism out of the jet bundle (def. 54)

$$ \array{ J^\infty_\Sigma E && \overset{v}{\longrightarrow} && T_\Sigma E \\ & {}_{\mathllap{jb_{\infty,0}}}\searrow && \swarrow \\ && E } $$

to the vertical tangent bundle $T_\Sigma E \to \Sigma$ (def. 6) of $E \overset{fb}{\to} \Sigma$.

In the special case that the field bundle is a trivial vector bundle over Minkowski spacetime as in example 9, this means that an evolutionary vector field is a tangent vector field (example 5) on $J^\infty_\Sigma(E)$ of the special form

$$ \begin{aligned} v & = v^a \partial_{\phi^a} \\ & = v^a\left( (x^\mu), (\phi^a), (\phi^a_{,\mu}), \cdots \right) \partial_{\phi^a} \,, \end{aligned} $$

where the coefficients $v^a \in C^\infty(J^\infty_\Sigma(E))$ are general smooth functions on the jet bundle (while the components are tangent vectors along the field coordinates $(\phi^a)$, but not along the spacetime coordinates $(x^\mu)$ and not along the jet coordinates $\phi^a_{,\mu_1 \cdots \mu_k}$).

We write

$$ \Gamma_E^{ev}\left( T_\Sigma E \right) \;\in\; \Omega^{0,0}_\Sigma(E) Mod $$

for the space of evolutionary vector fields, regarded as a module over the $\mathbb{R}$-algebra

$$ \Omega^{0,0}_\Sigma(E) \;=\; C^\infty\left( J^\infty_\Sigma(E) \right) $$

of smooth functions on the jet bundle.

An evolutionary vector field (def. 64) describes an infinitesimal change of field values depending on, possibly, the point in spacetime and the values of the field and all its derivatives (locally to finite order, by prop. 19).

This induces a corresponding infinitesimal change of the derivatives of the fields, called the prolongation of the evolutionary vector field:

Proposition

(prolongation of evolutionary vector field)

Let $E \overset{fb}{\to} \Sigma$ be a fiber bundle.

Given an evolutionary vector field $v$ on $E$ (def. 64) there is a unique tangent vector field $\hat v$ (example 5) on the jet bundle $J^\infty_\Sigma(E)$ (def. 54) such that

  1. $\hat v$ agrees on field coordinates (as opposed to jet coordinates) with $v$:

    $$ (jb_{\infty,0})_\ast(\hat v) = v \,, $$

    which means in the special case that $E \overset{fb}{\to} \Sigma$ is a trivial vector bundle over Minkowski spacetime (example 9) that $\hat v$ is of the form

    (68) $$ \hat v \;=\; \underset{ = v }{ \underbrace{ v^a \partial_{\phi^a} }} \,+\, \hat v^a_{\mu} \partial_{\phi^a_{,\mu}} + \hat v^a_{\mu_1 \mu_2} \partial_{\phi^a_{,\mu_1 \mu_2}} + \cdots $$

  2. contraction with $\hat v$ (def. 13) anti-commutes with the total spacetime derivative (def. 59):

    (69) $$ \iota_{\hat v} \circ d + d \circ \iota_{\hat v} = 0 \,. $$

In particular Cartan's homotopy formula (prop. 3) for the Lie derivative $\mathcal{L}_{\hat v}$ holds with respect to the variational derivative $\delta$:

(70) $$ \mathcal{L}_{\hat v} = \delta \circ \iota_{\hat v} + \iota_{\hat v} \circ \delta $$

Explicitly, in the special case that the field bundle is a trivial vector bundle over Minkowski spacetime (example 9), $\hat v$ is given by

(71) $$ \hat v = \sum_{n = 0}^{\infty} \frac{d^n v^a}{ d x^{\mu_1} \cdots d x^{\mu_n} } \partial_{\phi^a_{,\mu_1 \cdots \mu_n}} \,. $$
Proof

It is sufficient to prove the coordinate version of the statement. We prove this by induction over the maximal jet order $k$. Notice that the coefficient of $\partial_{\phi^a_{\mu_1 \cdots \mu_k}}$ in $\hat v$ is given by the contraction $\iota_{\hat v} \delta \phi^a_{\mu_1 \cdots \mu_k}$ (def. 13).

Similarly (at “$k = -1$”) the component of $\partial_{\mu}$ is given by $\iota_{\hat v} d x^{\mu}$. But by the second condition above this vanishes:

$$ \begin{aligned} \iota_{\hat v} d x^\mu & = - d \iota_{\hat v} x^\mu \\ & = 0 \end{aligned} $$

Moreover, the coefficient of $\partial_{\phi^a}$ in $\hat v$ is fixed by the first condition above to be

$$ \iota_{\hat v} \delta \phi^a = v^a \,. $$

This shows the statement for $k = 0$. Now assume that the statement is true up to some $k \in \mathbb{N}$. Observe that the coefficients of all $\partial_{\phi^a_{\mu_1 \cdots \mu_{k+1}}}$ are fixed by the contractions with $\delta \phi^a_{\mu_1 \cdots \mu_{k} \mu_{k+1}} \wedge d x^{\mu_{k+1}}$. For this we find again from the second condition and using $\delta \circ d + d \circ \delta = 0$ as well as the induction assumption that

$$ \begin{aligned} \iota_{\hat v} \delta \phi^a_{\mu_1 \cdots \mu_{k+1}} \wedge d x^{\mu_{k+1}} & = \iota_{\hat v} \delta d \phi^a_{\mu_1 \cdots \mu_k} \\ & = d \iota_{\hat v} \delta \phi^a_{\mu_1 \cdots \mu_k} \\ & = d \frac{d^k v^a}{d x^{\mu_1} \cdots d x^{\mu_k}} \\ & = \frac{d^{k+1}v^a }{d x^{\mu_1} \cdots d x^{\mu_{k+1}}} d x^{\mu_{k+1}} \,. \end{aligned} $$

This shows that $\hat v$ satisfying the two conditions given exists uniquely.

Finally formula (70) for the Lie derivative follows from the second of the two conditions with Cartan's homotopy formula $\mathcal{L}_{\hat v} = \mathbf{d} \circ \iota_{\hat v} + \iota_{\hat v} \circ \mathbf{d}$ (prop. 3) together with $\mathbf{d} = \delta + d$ (35).
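
For illustration, here is a worked instance of formula (71). It is a sketch under assumptions not made in the text: a single real scalar field $\phi$ on Minkowski spacetime and the hypothetical choice $v = \phi^2 \partial_\phi$. The coefficients follow by applying the total derivative $\frac{d}{d x^\mu} = \partial_\mu + \phi_{,\mu} \partial_\phi + \phi_{,\mu\nu} \partial_{\phi_{,\nu}} + \cdots$ repeatedly:

```latex
\begin{aligned}
  \frac{d}{d x^\mu} \phi^2
  & = 2 \phi \, \phi_{,\mu}
  \\
  \frac{d^2}{d x^\mu d x^\nu} \phi^2
  & = 2 \phi_{,\mu} \phi_{,\nu} + 2 \phi \, \phi_{,\mu\nu}
  \\
  \hat v
  & =
  \phi^2 \, \partial_\phi
  + 2 \phi \, \phi_{,\mu} \, \partial_{\phi_{,\mu}}
  + \left( 2 \phi_{,\mu} \phi_{,\nu} + 2 \phi \, \phi_{,\mu\nu} \right) \partial_{\phi_{,\mu\nu}}
  + \cdots
\end{aligned}
```

Here the sum over the index pair $\mu\nu$ is to be read with whatever convention for symmetric multi-indices is in force in (71).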

Proposition

(evolutionary vector fields form a Lie algebra)

Let $E \overset{fb}{\to} \Sigma$ be a fiber bundle. For any two evolutionary vector fields $v_1$, $v_2$ on $E$ (def. 64) the Lie bracket of tangent vector fields of their prolongations $\hat v_1$, $\hat v_2$ (prop. 28) is itself the prolongation $\widehat{[v_1, v_2]}$ of a unique evolutionary vector field $[v_1,v_2]$.

This defines the structure of a Lie algebra on evolutionary vector fields.

Proof

It is clear that $[\hat v_1, \hat v_2]$ is still vertical; therefore, by prop. 28, it is sufficient to show that contraction $\iota_{[\hat v_1, \hat v_2]}$ with this vector field (def. 13) anti-commutes with the horizontal derivative $d$, hence that the graded commutator vanishes: $[d, \iota_{[\hat v_1, \hat v_2]}] = 0$.

Now $[d, \iota_{[\hat v_1, \hat v_2]}]$ is an operator that sends vertical 1-forms to horizontal 1-forms and vanishes on horizontal 1-forms. Therefore it is sufficient to see that this operator in fact also vanishes on all vertical 1-forms. But for this it is sufficient that it commutes with the vertical derivative. This we check by Cartan calculus, using $[d,\delta] = 0$ and $[d, \iota_{\hat v_i}] = 0$, by assumption:

$$ \begin{aligned} {[ \delta, [ d,\iota_{[\hat v_1, \hat v_2]}] ]} & = - [d, [\delta, \iota_{[\hat v_1, \hat v_2]}]] \\ & = - [d, \mathcal{L}_{[\hat v_1, \hat v_2]}] \\ & = -[d, [\mathcal{L}_{\hat v_1}, \iota_{\hat v_2}] ] \\ & = - [d, [ [\delta, \iota_{\hat v_1}], \iota_{\hat v_2} ]] \\ & = 0 \,. \end{aligned} $$
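
A minimal worked instance of this bracket, under assumptions made here for illustration only: a single real scalar field $\phi$, and the two evolutionary vector fields chosen below. It uses the standard fact that the $\partial_\phi$-coefficient of a Lie bracket of vector fields is $\hat v_1(v_2) - \hat v_2(v_1)$, with $v_i$ denoting the coefficient functions:

```latex
\begin{aligned}
  v_1 & = \partial_\phi
  \,, \qquad
  \hat v_1 = \partial_\phi
  \\
  v_2 & = \phi \, \partial_\phi
  \,, \qquad
  \hat v_2 = \phi \, \partial_\phi + \phi_{,\mu} \, \partial_{\phi_{,\mu}} + \cdots
  \\
  [v_1, v_2]
  & =
  \left( \hat v_1(\phi) - \hat v_2(1) \right) \partial_\phi
  = \partial_\phi
\end{aligned}
```

Since neither coefficient depends on jets, the bracket here reduces to the ordinary Lie bracket of vector fields on the fiber, and $[\hat v_1, \hat v_2] = \partial_\phi$ is indeed the prolongation of $\partial_\phi$.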

Now given an evolutionary vector field, we want to consider the flow that it induces on the space of field histories:

Definition

(flow of field histories along evolutionary vector field)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34) and let $v$ be an evolutionary vector field (def. 64) such that the ordinary flow of its prolongation $\hat v$ (prop. 28)

$$ \exp(t \hat v) \;\colon\; J^\infty_\Sigma(E) \longrightarrow J^\infty_\Sigma(E) $$

exists on the jet bundle (e.g. if the order of derivatives of field coordinates that it depends on is bounded).

For $\Phi_{(-)} \colon U_1 \to \Gamma_\Sigma(E)$ a collection of field histories (hence a plot of the space of field histories, def. 33), the flow of $v$ through $\Phi_{(-)}$ is the smooth function

$$ U_1 \times \mathbb{R}^1 \overset{\exp(v)(\Phi_{(-)})}{\longrightarrow} \Gamma_\Sigma(E) $$

whose unique factorization $\widehat{\exp(v)}(\Phi_{(-)})$ through the space of jets of field histories (i.e. the image $im(j^\infty_\Sigma)$ of jet prolongation, def. 55)

$$ \array{ && im(j^\infty_\Sigma) &\hookrightarrow& \Gamma_\Sigma(J^\infty_\Sigma(E)) \\ & {}^{\mathllap{\widehat{\exp(v)}(\Phi_{(-)})}} \nearrow& \downarrow^{\mathrlap{\simeq}} \\ U_1 \times \mathbb{R}^1 &\underset{ \exp(v)(\Phi_{(-)}) }{\longrightarrow}& \Gamma_{\Sigma}(E) } $$

takes a plot $t_{(-)} \;\colon\; U_2 \to \mathbb{R}^1$ of the real line (regarded as a super smooth set via example 31) to the plot

(72) $$ \exp(t(-) \hat v) \circ j^\infty_\Sigma(\Phi_{(-)}) \;\colon\; U_1 \times U_2 \longrightarrow \Gamma_\Sigma\left( J^\infty_\Sigma(E) \right) $$

of the smooth space of sections of the jet bundle.

(That $\exp(t(-) \hat v)$ indeed flows jet prolongations $j^\infty_\Sigma(\Phi_{(-)})$ again to jet prolongations is due to its defining relation to the evolutionary vector field $v$ from prop. 28.)

Definition

(infinitesimal symmetries of the Lagrangian and conserved currents)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60).

Then

  1. an infinitesimal symmetry of the Lagrangian is an evolutionary vector field $v$ (def. 64) such that the Lie derivative of the Lagrangian density along its prolongation $\hat v$ (prop. 28) is a total spacetime derivative:

    $$ \mathcal{L}_{\hat v} \mathbf{L} \;=\; d \tilde J_{\hat v} $$

  2. an on-shell conserved current is a horizontal $p$-form $J \in \Omega^{p,0}_\Sigma(E)$ (def. 59) whose total spacetime derivative vanishes on the prolonged shell (48)

    $$ d J\vert_{\mathcal{E}^\infty} \;=\; 0 \,. $$
Proposition

(Noether's theorem I)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60).

If $v$ is an infinitesimal symmetry of the Lagrangian (def. 66) with $\mathcal{L}_{\hat v} \mathbf{L} = d \tilde J_{\hat v}$, then

(73) $$ J_{\hat v} \coloneqq \tilde J_{\hat v} - \iota_{\hat v} \Theta_{BFV} $$

is an on-shell conserved current (def. 66), for $\Theta_{BFV}$ a presymplectic potential (50) from def. 22.

(Noether's theorem II is prop. 78 below.)

Proof

By Cartan's homotopy formula for the Lie derivative (prop. 3), by the decomposition of the variational derivative $\delta \mathbf{L}$ (46), and by the fact that contraction $\iota_{\hat v}$ with the prolongation of an evolutionary vector field vanishes on horizontal differential forms (68) and anti-commutes with the horizontal differential (69), by def. 64, we may re-express the defining equation for the symmetry as follows:

$$ \begin{aligned} d \tilde J_{\hat v} & = \mathcal{L}_{\hat v} \mathbf{L} \\ & = \iota_{\hat v} \underset{= \delta_{EL}\mathbf{L} - d \Theta_{BFV}}{\underbrace{\mathbf{d} \mathbf{L}}} + \mathbf{d} \underset{= 0}{\underbrace{\iota_{\hat v} \mathbf{L}}} \\ & = \iota_{\hat v} \delta_{EL} \mathbf{L} + d \iota_{\hat v} \Theta_{BFV} \end{aligned} $$

which is equivalent to

$$ d(\underset{= J_{\hat v}}{\underbrace{\tilde J_{\hat v} - \iota_{\hat v} \Theta_{BFV}}}) = \iota_{\hat v} \delta_{EL}\mathbf{L} $$

Since, by definition of the shell $\mathcal{E}$, the differential form on the right vanishes on $\mathcal{E}$, this yields the claim.

Example

(energy-momentum of the scalar field)

Consider the Lagrangian field theory of the free scalar field from def. 39:

$$ \mathbf{L} \;=\; \tfrac{1}{2} \left( \eta^{\mu \nu}\phi_{,\mu} \phi_{,\nu} - m^2 \phi^2 \right) dvol_\Sigma \,. $$

For $\nu \in \{0, 1, \cdots, p\}$ consider the vector field on the jet bundle given by

$$ v_\nu \;\coloneqq\; \phi_{,\nu} \partial_{\phi} + \phi_{,\mu \nu} \partial_{\phi_{,\mu}} + \cdots \,. $$

This describes infinitesimal translations of the fields in the direction of $\partial_\nu$.

And this is an infinitesimal symmetry of the Lagrangian (def. 66), since

$$ \iota_{v_\nu} \mathbf{d}\mathbf{L} = d L \wedge \iota_{\partial_\nu} dvol_\Sigma \,. $$

With the formula (60) for the presymplectic potential

$$ \Theta_{BFV} = \eta^{\mu \nu} \phi_{,\mu} \delta \phi \, \iota_{\partial_{\nu}} dvol_\Sigma $$

it hence follows from Noether's theorem (prop. 30) that the corresponding conserved current (def. 66) is

$$ \begin{aligned} T_\nu & = L \, \iota_{\partial_\nu} dvol_\Sigma - \iota_{v_\nu}\Theta_{BFV} \\ & = L \, \iota_{\partial_\nu} dvol_\Sigma - \eta^{\rho \mu} \phi_{,\rho} \phi_{,\nu} \, \iota_{\partial_\mu} dvol_\Sigma \\ & = ( \underset{=: T^\mu_\nu}{ \underbrace{ \delta^\mu_\nu L - \eta^{\rho \mu} \phi_{,\rho} \phi_{,\nu} } } ) \, \iota_{\partial_\mu} dvol_\Sigma \,. \end{aligned} $$

This conserved current is called the energy-momentum tensor.
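
The on-shell conservation can also be checked directly. The following is a sketch using only the Lagrangian of this example and the total derivative $\frac{d}{d x^\mu}$; the intermediate steps are not spelled out in the source:

```latex
\begin{aligned}
  \frac{d}{d x^\mu} T^\mu_\nu
  & =
  \frac{d L}{d x^\nu}
  - \eta^{\rho\mu} \phi_{,\rho\mu} \phi_{,\nu}
  - \eta^{\rho\mu} \phi_{,\rho} \phi_{,\nu\mu}
  \\
  & =
  \left( \eta^{\rho\mu} \phi_{,\rho} \phi_{,\mu\nu} - m^2 \phi \, \phi_{,\nu} \right)
  - \eta^{\rho\mu} \phi_{,\rho\mu} \phi_{,\nu}
  - \eta^{\rho\mu} \phi_{,\rho} \phi_{,\nu\mu}
  \\
  & =
  - \left( \eta^{\rho\mu} \phi_{,\rho\mu} + m^2 \phi \right) \phi_{,\nu}
\end{aligned}
```

The factor in parentheses is, up to sign, the Euler-Lagrange expression $\frac{\delta_{EL} L}{\delta \phi} = - \eta^{\mu\nu} \phi_{,\mu\nu} - m^2 \phi$ of this Lagrangian, so that $d T_\nu = \iota_{v_\nu} \delta_{EL}\mathbf{L}$ vanishes on the shell, in agreement with prop. 30.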

Example

(Dirac current)

Consider the Lagrangian field theory of the free Dirac field on Minkowski spacetime in spacetime dimension $p + 1 = 3+1$ (example 43)

$$ \mathbf{L} = i \overline{\psi} \gamma^\mu \psi_{,\mu} \, dvol_\Sigma \,. $$

Then the prolongation (prop. 28) of the evolutionary vector field (def. 64)

$$ v \;\coloneqq\; i \psi_\alpha \partial_{\psi_\alpha} $$

is an infinitesimal symmetry of the Lagrangian (def. 66). The conserved current that corresponds to this under Noether's theorem I (prop. 30) is

$$ \overline{\psi} \gamma^\mu \psi \, \iota_{\partial_\mu} dvol_\Sigma \;\in\; \Omega^{p,0}_{\Sigma}(E) \,. $$

This is called the Dirac current.

In fact, due to the supergeometric nature of the Dirac field, the Dirac current is conserved even off-shell, as discussed in remark 17.

Proof

By equation (71) the prolongation of $v$ is

$$ \hat v = i \psi_\alpha \partial_{\psi_\alpha} + i \psi_{\alpha,\mu} \partial_{\psi_{\alpha,\mu}} + \cdots \,. $$

Therefore

$$ \begin{aligned} \mathcal{L}_{\hat v} \left( i \overline{\psi} \gamma^\mu \psi_{,\mu} \right) dvol_\Sigma & = \underset{ = i \cdot (-i) \overline{\psi} \gamma^\mu \psi_{,\mu} }{ \underbrace{ i \overline{i \psi} \gamma^\mu \psi_{,\mu} } } dvol_\Sigma + \underset{ = i \cdot i \overline{\psi} \gamma^\mu \psi_{,\mu} }{ \underbrace{ i \overline{\psi} \gamma^\mu (i \psi_{,\mu}) } } dvol_\Sigma \\ & = 0 \end{aligned} $$

\,

Since an infinitesimal symmetry of a Lagrangian (def. 66) by definition changes the Lagrangian only up to a total spacetime derivative, and since the Euler-Lagrange equations of motion by construction depend on the Lagrangian density only up to a total spacetime derivative (prop. 22), it is plausible that an infinitesimal symmetry of the Lagrangian preserves the equations of motion (47), hence the shell (49). That this is indeed the case is the statement of prop. 33 below.

To make the proof transparent, we now first introduce the concept of the evolutionary derivative (def. 68 below) and then observe that in terms of these the Euler-Lagrange derivative is in fact a derivation (prop. 31).

Definition

(field-dependent sections)

For

$$ E \overset{fb}{\longrightarrow} \Sigma $$

a fiber bundle (def. 6), regarded as a field bundle (def. 34), and for

$$ E' \overset{fb'}{\longrightarrow} \Sigma $$

any other fiber bundle over the same base space (spacetime), we write

$$ \Gamma_{J^\infty_\Sigma(E)}(E') \;\coloneqq\; \Gamma_{J^\infty_\Sigma(E)}( jb^\ast E' ) \;=\; Hom_\Sigma(J^\infty_\Sigma(E), E') \;\simeq\; DiffOp(E,E') $$

for the space of sections of the pullback of bundles of $E'$ to the jet bundle $J^\infty_\Sigma(E) \overset{jb}{\longrightarrow} \Sigma$ (def. 54) along $jb$:

$$ \Gamma_{J^\infty_\Sigma(E)}(E') \;=\; \left\{ \array{ && E' \\ & \nearrow & \downarrow \mathrlap{fb'} \\ J^\infty_\Sigma(E) &\underset{jb}{\longrightarrow}& \Sigma } \phantom{A}\,\, \right\} \,. $$

(Equivalently this is the space of differential operators from sections of $E$ to sections of $E'$, according to prop. 56.)

In (Olver 93, section 5.1, p. 288) the field dependent sections of def. 67, considered in local coordinates, are referred to as tuples of differential functions.

Example

(source forms and evolutionary vector fields are field-dependent sections)

For $E \overset{fb}{\to} \Sigma$ a field bundle, write $T_\Sigma E$ for its vertical tangent bundle (example 6) and $T_\Sigma^\ast E$ for its dual vector bundle (def. 8), the vertical cotangent bundle.

Then the field-dependent sections of these bundles according to def. 67 are identified as follows:

  • the space $\Gamma_{J^\infty_\Sigma(E)}(T_\Sigma E)$ contains the space of evolutionary vector fields $v$ (def. 64) as those bundle morphisms which respect not just the projection to $\Sigma$ but also its factorization through $E$:

    $$ \left( \array{ && T_\Sigma E \\ & {}^{\mathllap{v}}\nearrow & \downarrow^{\mathrlap{tb_\Sigma}} \\ J^\infty_\Sigma(E) &\underset{jb_{\infty,0}}{\longrightarrow}& E & \underset{fb}{\longrightarrow}& \Sigma } \right) \;\in\; \Gamma_{J^\infty_\Sigma(E)}(T_\Sigma E) $$

  • $\Gamma_{J^\infty_\Sigma(E)}( T^\ast_\Sigma E) \otimes \wedge^{p+1}_\Sigma(T^\ast \Sigma)$ contains the space of source forms $E$ (prop. 22) as those bundle morphisms which respect not just the projection to $\Sigma$ but also its factorization through $E$:

    $$ \left( \array{ && T^\ast_\Sigma E \\ & {}^{E}\nearrow & \downarrow^{\mathrlap{ctb_\Sigma}} \\ J^\infty_\Sigma(E) &\underset{jb_{\infty,0}}{\longrightarrow}& E & \underset{fb}{\longrightarrow}& \Sigma } \right) \;\in\; \Gamma_{J^\infty_\Sigma(E)}(T^\ast_\Sigma E) $$

This makes manifest the duality pairing between source forms and evolutionary vector fields

$$ \array{ \Gamma_{J^\infty_\Sigma(E)}(T_\Sigma E) \otimes \Gamma_{J^\infty_\Sigma(E)}(T^\ast_\Sigma E) &\longrightarrow& C^\infty(J^\infty_\Sigma(E)) } $$

which in local coordinates is given by

$$ (v^a \partial_{\phi^a} \,,\, \omega_a \delta \phi^a) \mapsto v^a \omega_a $$

for $v^a, \omega_a \in C^\infty(J^\infty_\Sigma(E))$ smooth functions on the jet bundle (as in prop. 19).

Definition

(evolutionary derivative of field-dependent section)

Let

$$ E \overset{fb}{\to} \Sigma $$

be a fiber bundle regarded as a field bundle (def. 34) and let

$$ V \overset{vb}{\to} \Sigma $$

be a vector bundle (def. 7). Then for

$$ P \in \Gamma_{J^\infty_\Sigma(E)}(V) $$

a field-dependent section of $V$ according to def. 67, its evolutionary derivative is the morphism

$$ \array{ \Gamma_{J^\infty_\Sigma(E)}(T_\Sigma E) & \overset{ \mathrm{D}P }{\longrightarrow} & \Gamma_{J^\infty_\Sigma(E)}(V) \\ v &\mapsto& \hat v(P) } $$

which, under the identification of example 56, sends an evolutionary vector field $v$ to the derivative of $P$ (example 5) along the prolongation tangent vector field $\hat v$ of $v$ (prop. 28).

In the case that $E$ and $V$ are trivial vector bundles over Minkowski spacetime with coordinates $((x^\mu), (\phi^a))$ and $((x^\mu), (\rho^b))$, respectively (example 9), then by (71) this is given by

$$ ((\mathrm{D}P)(v))^b \;=\; \left( v^a \frac{\partial P^b}{\partial \phi^a} + \frac{d v^a}{d x^\mu} \frac{\partial P^b}{\partial \phi^a_{,\mu}} + \frac{d^2 v^a}{d x^\mu d x^\nu} \frac{\partial P^b}{\partial \phi^a_{,\mu \nu}} + \cdots \right) $$

This makes manifest that $\mathrm{D}P$ may equivalently be regarded as a $J^\infty_\Sigma(E)$-dependent differential operator (def. 56) from the vertical tangent bundle $T_\Sigma E$ (def. 6) to $V$, namely a bundle homomorphism over $\Sigma$ of the form

$$ \mathrm{D}_P \;\colon\; J^\infty_\Sigma(E) \times_\Sigma J^\infty_\Sigma T_\Sigma E \longrightarrow V $$

in that

(74) $$ \mathrm{D}_P(-,v) = \mathrm{D}P(v) = \hat v (P) \,. $$

(Olver 93, def. 5.24)
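
As an illustration, here is a worked instance of the coordinate formula for $\mathrm{D}P$. It assumes a single real scalar field $\phi$ and the hypothetical field-dependent function $P = \phi \, \phi_{,\mu}$ for one fixed index $\mu$; neither choice is taken from the source:

```latex
\begin{aligned}
  \frac{\partial P}{\partial \phi} & = \phi_{,\mu}
  \,, \qquad
  \frac{\partial P}{\partial \phi_{,\nu}} = \phi \, \delta^\nu_\mu
  \\
  (\mathrm{D}P)(v)
  & =
  v \, \phi_{,\mu} + \phi \, \frac{d v}{d x^\mu}
  =
  \frac{d}{d x^\mu} \left( v \, \phi \right)
\end{aligned}
```

This is consistent with $P = \tfrac{1}{2} \frac{d}{d x^\mu} \phi^2$ and $\hat v\left(\tfrac{1}{2}\phi^2\right) = v \, \phi$, since prolongations of evolutionary vector fields commute with total derivatives.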

Example

(evolutionary derivative of Lagrangian function)

Over Minkowski spacetime $\Sigma$ (def. 23), let $\mathbf{L} = L \, dvol \in \Omega^{p+1,0}_\Sigma(E)$ be a Lagrangian density (def. 60), with coefficient function regarded as a field-dependent section (def. 67) of the trivial real line bundle:

$$ L \;\in \; \Gamma_{J^\infty_\Sigma(E)}(\Sigma \times \mathbb{R}) \,. $$

Then the formally adjoint differential operator (def. 58)

$$ (\mathrm{D}_L)^\ast \;\colon\; J^\infty_\Sigma(E)\times_\Sigma (\Sigma \times \mathbb{R})^\ast \longrightarrow T_\Sigma^\ast E $$

of its evolutionary derivative (def. 68), regarded as a $J^\infty_\Sigma(E)$-dependent differential operator $\mathrm{D}_L$ from $T_\Sigma E$ to $\Sigma \times \mathbb{R}$ and applied to the constant section

$$ 1 \in \Gamma_\Sigma(\Sigma \times \mathbb{R}^\ast) $$

is the Euler-Lagrange derivative (47)

$$ \delta_{EL}\mathbf{L} \;=\; \left(\mathrm{D}_{L}\right)^\ast(1) \;\in\; \Gamma_{J^\infty_\Sigma(E)}(T_\Sigma^\ast E) \simeq \Omega^{p+1,1}_\Sigma(E)_{source} $$

via the identification from example 56.
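
For the free scalar field with $L = \tfrac{1}{2}\left( \eta^{\mu\nu} \phi_{,\mu} \phi_{,\nu} - m^2 \phi^2 \right)$, as in the energy-momentum example above, this can be traced through explicitly. The following is a sketch which assumes that the formal adjoint is computed by integrating by parts against a test function $w$ (a symbol introduced here for illustration):

```latex
\begin{aligned}
  \mathrm{D}_L(v)
  & =
  \eta^{\mu\nu} \phi_{,\mu} \frac{d v}{d x^\nu} - m^2 \phi \, v
  \\
  (\mathrm{D}_L)^\ast(w)
  & =
  - \frac{d}{d x^\nu} \left( \eta^{\mu\nu} \phi_{,\mu} \, w \right) - m^2 \phi \, w
  \\
  (\mathrm{D}_L)^\ast(1)
  & =
  - \eta^{\mu\nu} \phi_{,\mu\nu} - m^2 \phi
\end{aligned}
```

The last line is the Klein-Gordon expression, i.e. the coefficient of $\delta \phi \wedge dvol_\Sigma$ in $\delta_{EL}\mathbf{L}$ for this Lagrangian.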

Proposition

(Euler-Lagrange derivative is derivation via evolutionary derivatives)

Let $V \overset{vb}{\to} \Sigma$ be a vector bundle (def. 7) and write $V^\ast \to \Sigma$ for its dual vector bundle (def. 8).

For field-dependent sections (def. 67)

$$ \alpha \in \Gamma_{J^\infty_\Sigma(E)}(V) $$

and

$$ \beta^\ast \in \Gamma_{J^\infty_\Sigma(E)}(V^\ast) $$

we have that the Euler-Lagrange derivative (47) of their canonical pairing to a smooth function on the jet bundle (as in prop. 19) is the sum of the derivative of either one via the formally adjoint differential operator (def. 58) of the evolutionary derivative (def. 68) of the other:

$$ \delta_{EL}( \alpha \cdot \beta^\ast ) \;=\; (\mathrm{D}_\alpha)^\ast(\beta^\ast) + (\mathrm{D}_{\beta^\ast})^\ast(\alpha) $$
Proof

It is sufficient to check this in local coordinates. By the product law for differentiation we have

$$ \begin{aligned} \frac{ \delta_{EL} \left(\alpha \cdot \beta^\ast \right) } { \delta \phi^a } & = \frac{\partial \left(\alpha \cdot \beta^\ast \right)}{\partial \phi^a} - \frac{d}{d x^\mu} \left( \frac{\partial \left( \alpha \cdot \beta^\ast \right)}{\partial \phi^a_{,\mu}} \right) + \frac{d^2}{d x^\mu d x^\nu} \left( \frac{\partial \left( \alpha \cdot \beta^\ast \right) }{\partial \phi^a_{,\mu \nu}} \right) - \cdots \\ & = \phantom{+} \frac{\partial \alpha }{\partial \phi^a} \cdot \beta^\ast - \frac{d}{d x^\mu} \left( \frac{\partial \alpha }{\partial \phi^a_{,\mu}} \cdot \beta^\ast \right) + \frac{d^2}{d x^\mu d x^\nu} \left( \frac{\partial \alpha }{\partial \phi^a_{,\mu \nu}} \cdot \beta^\ast \right) - \cdots \\ & \phantom{=} + \frac{\partial \beta^\ast }{\partial \phi^a} \cdot \alpha - \frac{d}{d x^\mu} \left( \frac{\partial \beta^\ast }{\partial \phi^a_{,\mu}} \cdot \alpha \right) + \frac{d^2}{d x^\mu d x^\nu} \left( \frac{\partial \beta^\ast }{\partial \phi^a_{,\mu \nu}} \cdot \alpha \right) - \cdots \\ & = (\mathrm{D}_\alpha)^\ast(\beta^\ast) + (\mathrm{D}_{\beta^\ast})^\ast(\alpha) \end{aligned} $$
Proposition

(evolutionary derivative of Euler-Lagrange forms is formally self-adjoint)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23) and regard the Euler-Lagrange derivative

$$ \delta_{EL}\mathbf{L} \;=\; \delta_{EL}L \wedge dvol_\Sigma $$

(from prop. 22) as a field-dependent section of the vertical cotangent bundle

$$ \delta_{EL}L \;\in\; \Gamma_{J^\infty_\Sigma(E)}(T^\ast_\Sigma E) $$

as in example 56. Then the corresponding evolutionary derivative field-dependent differential operator $\mathrm{D}_{\delta_{EL}L}$ (def. 68) is formally self-adjoint (def. 58):

$$ (\mathrm{D}_{\delta_{EL}L})^\ast \;=\; \mathrm{D}_{\delta_{EL}L} $$

(Olver 93, theorem 5.92) The following proof is due to Igor Khavkine.

Proof

By definition of the Euler-Lagrange form (def. 22) we have

$$ \frac{\delta_{EL} L }{\delta \phi^a} \delta \phi^a \, \wedge dvol_\Sigma \;=\; \delta L \,\wedge dvol_\Sigma \;+\; d(...) \,. $$

Applying the variational derivative $\delta$ (def. 59) to both sides of this equation yields

$$ \left(\delta \frac{\delta_{EL} L }{\delta \phi^a}\right) \wedge \delta \phi^a \, \wedge dvol_\Sigma \;=\; \underset{= 0}{\underbrace{\delta \delta L}} \wedge dvol_\Sigma \;+\; d(...) \,. $$

It follows that for $v,w$ any two evolutionary vector fields the contraction (def. 13) of their prolongations $\hat v$ and $\hat w$ (prop. 28) into the differential 2-form on the left is

$$ \left( \delta \frac{\delta_{EL} L }{\delta \phi^a} \wedge \delta \phi^a \right)(v,w) = w^a (\mathrm{D}_{\delta_{EL}L})_a(v) - v^b(\mathrm{D}_{\delta_{EL}L})_b(w) \,, $$

by inspection of the definition of the evolutionary derivative (def. 68). Moreover, their contraction into the differential form on the right is

$$ \iota_{\hat v} \iota_{\hat w} d(...) \;=\; d(...) $$

by the fact (prop. 28) that contraction with prolongations of evolutionary vector fields anti-commutes with the total spacetime derivative (69).

Hence the last two equations combined give

$$ w^a (\mathrm{D}_{\delta_{EL}L})_a(v) - v^b(\mathrm{D}_{\delta_{EL}L})_b(w) \;=\; d(...) \,. $$

This is the defining condition for $\mathrm{D}_{\delta_{EL}L}$ to be a formally self-adjoint differential operator (def. 58).
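
For the free scalar field this condition can be verified by hand. The following is a sketch which assumes $\frac{\delta_{EL} L}{\delta \phi} = - \eta^{\mu\nu} \phi_{,\mu\nu} - m^2 \phi$, as obtained from the Lagrangian of the energy-momentum example above, with $v$ and $w$ the coefficient functions of two evolutionary vector fields:

```latex
\begin{aligned}
  \mathrm{D}_{\delta_{EL}L}(v)
  & =
  - \eta^{\mu\nu} \frac{d^2 v}{d x^\mu d x^\nu} - m^2 v
  \\
  w \, \mathrm{D}_{\delta_{EL}L}(v) - v \, \mathrm{D}_{\delta_{EL}L}(w)
  & =
  - \eta^{\mu\nu}
  \left(
    w \frac{d^2 v}{d x^\mu d x^\nu} - v \frac{d^2 w}{d x^\mu d x^\nu}
  \right)
  \\
  & =
  \frac{d}{d x^\mu}
  \left(
    - \eta^{\mu\nu}
    \left( w \frac{d v}{d x^\nu} - v \frac{d w}{d x^\nu} \right)
  \right)
\end{aligned}
```

The mass terms cancel and the terms with two first derivatives cancel by symmetry of $\eta^{\mu\nu}$, so the difference is a total spacetime derivative, as required.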

\,

Now we may finally prove that an infinitesimal symmetry of the Lagrangian is also an infinitesimal symmetry of the Euler-Lagrange equations of motion:

Proposition

(infinitesimal symmetries of the Lagrangian are also infinitesimal symmetries of the equations of motion)

Let $(E,\mathbf{L})$ be a Lagrangian field theory. If an evolutionary vector field $v$ is an infinitesimal symmetry of the Lagrangian then the flow along its prolongation $\hat v$ preserves the prolonged shell $\mathcal{E}^\infty \hookrightarrow J^\infty_\Sigma(E)$ (49) in that the Lie derivative of the Euler-Lagrange form $\delta_{EL}\mathbf{L}$ along $\hat v$ vanishes on $\mathcal{E}^\infty$:

$$ \mathcal{L}_{\hat v}\mathbf{L} = d(...) \phantom{AAA} \Rightarrow \phantom{AAA} \mathcal{L}_{\hat v} \, \delta_{EL}\mathbf{L}\vert_{\mathcal{E}^\infty} = 0 \,. $$
Proof

Notice that for any vector field $\hat v$ the Lie derivative (prop. 3) $\mathcal{L}_{\hat v}$ of the Euler-Lagrange form $\delta_{EL}\mathbf{L} = \frac{\delta_{EL}L}{\delta \phi^a} \delta \phi^a \wedge dvol_\Sigma$ differs from that of its component functions $\frac{\delta_{EL}L}{\delta \phi^a} dvol_\Sigma$ by a term proportional to these component functions, which by definition vanishes on-shell:

$$ \mathcal{L}_{\hat v} \left( \frac{\delta_{EL} L}{\delta \phi^a} \delta \phi^a \wedge dvol_\Sigma \right) \;=\; \underset{ = \hat v\left( \frac{\delta_{EL}L}{\delta \phi^a} \right) }{ \underbrace{ \left( \mathcal{L}_{\hat v} \frac{\delta_{EL}L}{\delta \phi^a} \right) } } \delta \phi^a \wedge dvol_\Sigma + \underset{ = 0 \, \text{on} \, \mathcal{E}^\infty }{ \underbrace{ \frac{\delta_{EL}L}{\delta \phi^a} } } \left( \mathcal{L}_{\hat v} \delta \phi^a \right) \wedge dvol_\Sigma \,. $$

But the Lie derivative of the component functions is just their plain derivative. Therefore it is sufficient to show that

$$ \hat v \left( \frac{\delta_{EL} L}{\delta \phi^a} \right) \vert_{\mathcal{E}^\infty} \;=\; 0 \,. $$

Now by Noether's theorem I (prop. 30) the condition $\mathcal{L}_{\hat v}\mathbf{L} = d \tilde J_{\hat v}$ for an infinitesimal symmetry of the Lagrangian implies that the contraction (def. 13) of the Euler-Lagrange form with the corresponding evolutionary vector field is a total spacetime derivative:

$$ \iota_{\hat v} \, \delta_{EL}\mathbf{L} \;=\; d J_{\hat v} \,. $$

Since the Euler-Lagrange derivative vanishes on total spacetime derivatives (example 50), its application to the contraction on the left also vanishes. But via example 56 that contraction is a pairing of field-dependent sections as in prop. 31. Hence we use this proposition to compute:

$$
\begin{aligned}
  0 & = \frac{\delta_{EL} \left( v \cdot \delta_{EL} L\right) }{ \delta \phi^a }
  \\
  & = (\mathrm{D}_{v})^\ast( \delta_{EL}L ) + (\mathrm{D}_{\delta_{EL}L})^\ast(v)
  \\
  & = (\mathrm{D}_{v})^\ast( \delta_{EL}L ) + (\mathrm{D}_{\delta_{EL}L})(v)
  \\
  & = (\mathrm{D}_{v})^\ast( \delta_{EL}L ) + \hat v(\delta_{EL}L) \,.
\end{aligned}
$$

Here the first step is by prop. 31, the second step is by prop. 32 and the third step is (74).

Hence

$$
\begin{aligned}
  \hat v(\delta_{EL}L) \vert_{\mathcal{E}^\infty}
  & = - (\mathrm{D}_{v})^\ast( \delta_{EL}L ) \vert_{\mathcal{E}^\infty}
  \\
  & = 0 \,,
\end{aligned}
$$

where in the last line we used that on the prolonged shell $\delta_{EL}L$ and all its horizontal derivatives vanish, by definition.

As a corollary we obtain:

Proposition

(flow along infinitesimal symmetry of the Lagrangian preserves on-shell space of field histories)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60).

For $v$ an infinitesimal symmetry of the Lagrangian (def. 66) the flow on the space of field histories (example 16) that it induces by def. 65 preserves the space of on-shell field histories (from prop. 22):

$$
\array{
  \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} &\hookrightarrow& \Gamma_\Sigma(E)
  \\
  {\mathllap{\exp(\hat v)\vert_{\delta_{EL}\mathbf{L} = 0} }} \uparrow && \uparrow {\mathrlap{\exp(\hat v)}}
  \\
  \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} &\hookrightarrow& \Gamma_\Sigma(E)
}
$$
Proof

By def. 61 a field history $\Phi \in \Gamma_\Sigma(E)$ is on-shell precisely if its jet prolongation $j^\infty_\Sigma(\Phi)$ (def. 55) factors through the shell $\mathcal{E} \hookrightarrow J^\infty_\Sigma(E)$ (48). Hence by def. 65 the statement is equivalently that the ordinary flow (prop. 3) of $\hat v$ (def. 28) on the jet bundle $J^\infty_\Sigma(E)$ preserves the shell. This in turn means that it preserves the vanishing locus of the Euler-Lagrange form $\delta_{EL} \mathbf{L}$, which is the case by prop. 33.

\,

infinitesimal symmetries of the presymplectic potential current

Evidently Noether's theorem I in variational calculus (prop. 30) is the special case for horizontal $p+1$-forms of a more general phenomenon relating symmetries of variational forms to forms that are closed up to a contraction. The same phenomenon applied instead to the presymplectic current yields the following:

Definition

(variational Lie derivative)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34) with jet bundle $J^\infty_\Sigma(E)$ (def. 54).

For $v$ a vertical tangent vector field on the jet bundle (a variation, def. 63) write

$$ \mathcal{L}^{var}_{v} \;\coloneqq\; \delta \circ \iota_v + \iota_v \circ \delta \tag{75} $$

for the variational Lie derivative along $v$, analogous to Cartan's homotopy formula (prop. 3) but defined in terms of the variational derivative $\delta$ (35) as opposed to the full de Rham differential.

Then for $v_1$ and $v_2$ two vertical vector fields, write

$$ [v_1, v_2]^{var} \;\in \; \Gamma( T_{vert} J^\infty_\Sigma(E) ) $$

for the vector field whose contraction operator (def. 13) is given by

$$
\begin{aligned}
  \iota_{[v_1,v_2]^{var}}
  & = \left[ \mathcal{L}^{var}_{v_1}, \iota_{v_2} \right]
  \\
  & \coloneqq \mathcal{L}^{var}_{v_1} \circ \iota_{v_2} - \iota_{v_2} \circ \mathcal{L}^{var}_{v_1} \,.
\end{aligned}
$$
Definition

(infinitesimal symmetry of the presymplectic potential and Hamiltonian differential forms)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) with presymplectic potential current $\Theta_{BFV}$ (50). Write $\mathcal{E} \hookrightarrow J^\infty_\Sigma(E)$ for the shell (48).

Then:

  1. An on-shell variation $v$ (def. 63) is an infinitesimal symmetry of the presymplectic current or Hamiltonian vector field if on-shell (def. 22) the variational Lie derivative (def. 69) of the presymplectic potential current $\Theta_{BFV}$ along $v$ is a variational derivative:

    $$ (\delta \circ \iota_v + \iota_v \circ \delta) \Theta_{BFV} = \delta \tilde H_v \phantom{AAA} \text{on}\, \mathcal{E} $$

    for some variational form $\tilde H_v$.

  2. A Hamiltonian differential form $H$ (or local Hamiltonian current) is a variational form on the shell such that there exists a variation $v$ with

    $$ \delta H = \iota_v \Omega_{BFV} \phantom{AA} \, \text{on}\, \mathcal{E} \,. $$

We write

$$ \Omega^{p,0}_{\Sigma, Ham}(E) \;\coloneqq\; \left\{ (H,v) \;\vert\; v \, \text{is a variation and}\, \iota_v \Omega_{BFV} = \delta H \right\} $$

for the space of pairs consisting of a Hamiltonian differential form on-shell and a corresponding variation.

Proposition

(Hamiltonian Noether's theorem)

A variation $v$ is an infinitesimal symmetry of the presymplectic potential (def. 70) with $\mathcal{L}^{var}_v ( \Theta_{BFV} ) = \delta \tilde H_v$ precisely if

$$ H_v \coloneqq \tilde H_v - \iota_v \Theta_{BFV} $$

is a Hamiltonian differential form for $v$.

Proof

From the definition (75) of $\mathcal{L}^{var}_v$ we have

$$
\begin{aligned}
  & \mathcal{L}^{var}_v \Theta_{BFV} = \delta \tilde H_v
  \\
  \Leftrightarrow\;\; & \delta \iota_v \Theta_{BFV} + \iota_v \underset{= \Omega_{BFV}}{\underbrace{\delta \Theta_{BFV}}} = \delta \tilde H_v
  \\
  \Leftrightarrow\;\; & \delta \left( \tilde H_v - \iota_v \Theta_{BFV} \right) = \iota_v \Omega_{BFV} \,,
\end{aligned}
$$

where we used the definition (52) of $\Omega_{BFV}$.

\,

Since both the conserved currents from Noether's theorem and the Hamiltonian differential forms are thus generators of infinitesimal symmetries of certain variational forms (namely of the Lagrangian density and of the presymplectic current, respectively), each of them forms a Lie algebra. For the conserved currents this is sometimes known as the Dickey bracket Lie algebra. For the Hamiltonian forms it is the Poisson bracket Lie p+1-algebra. Since here for simplicity we are considering just vertical variations, we have just a plain Lie algebra. The transgression of this Lie algebra of Hamiltonian forms on the jet bundle to Cauchy surfaces yields a presymplectic structure on phase space; this we discuss below.

Proposition

(local Poisson bracket)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60).

On the space $\Omega^{p,0}_{\Sigma,Ham}(E)$ of pairs $(H,v)$ of Hamiltonian differential forms $H$ with compatible variation $v$ (def. 70) the following operation constitutes a Lie bracket:

$$ \left\{(H_1, v_1),\, (H_2, v_2)\right\} \;\coloneqq\; (\iota_{v_1} \iota_{v_2} \Omega_{BFV},\, [v_1,v_2]^{var}) \,, \tag{76} $$

where $[v_1, v_2]^{var}$ is the variational Lie bracket from def. 69.

We call this the local Poisson Lie bracket.

Proof

First we need to check that the bracket is well defined in itself. It is clear that it is linear and skew-symmetric, but what needs proof is that it does indeed land in $\Omega^{p,0}_{\Sigma,Ham}(E)$, hence that the following equation holds:

$$ \delta \iota_{v_1} \iota_{v_2} \Omega_{BFV} \;=\; \iota_{[v_1, v_2]^{var}} \Omega_{BFV} \,. $$

With def. 69 for $\mathcal{L}^{var}$ and $[-,-]^{var}$ we compute this as follows:

$$
\begin{aligned}
  \delta \iota_{v_1} \iota_{v_2} \Omega_{BFV}
  & = \tfrac{1}{2} \delta \iota_{v_1} \iota_{v_2} \Omega_{BFV} - \tfrac{1}{2} (v_1 \leftrightarrow v_2)
  \\
  & = \tfrac{1}{2} \left( \mathcal{L}^{var}_{v_1} \iota_{v_2} \Omega_{BFV} - \iota_{v_1} \delta \iota_{v_2} \Omega_{BFV} \right) - \tfrac{1}{2} (v_1 \leftrightarrow v_2)
  \\
  & = \tfrac{1}{2} \left( \mathcal{L}^{var}_{v_1} \iota_{v_2} \Omega_{BFV} - \iota_{v_1} \mathcal{L}^{var}_{v_2} \Omega_{BFV} + \iota_{v_1} \iota_{v_2} \underset{= 0}{\underbrace{\delta \Omega_{BFV}}} \right) - \tfrac{1}{2} (v_1 \leftrightarrow v_2)
  \\
  & = [\mathcal{L}^{var}_{v_1}, \iota_{v_2}] \Omega_{BFV}
  \\
  & = \iota_{[v_1, v_2]^{var}} \Omega_{BFV} \,.
\end{aligned}
$$

Here the second-to-last step uses that $\mathcal{L}^{var}_{v_i} \Omega_{BFV} = 0$ for Hamiltonian $v_i$. This shows that the bracket is well defined.

It remains to see that the bracket satisfies the Jacobi identity:

$$ \left\{ (H_1, v_1), \left\{ (H_2, v_2), (H_3,v_3) \right\} \right\} \;+\; (cyclic) \;=\; 0 $$

hence that

$$ \left( \iota_{v_1} \iota_{[v_2,v_3]^{var}} \Omega_{BFV} ,\, [v_1, [v_2, v_3]^{var}]^{var} \right) \;+\; (cyclic) \;=\; 0 \,. $$

Here $[v_1, [v_2, v_3]^{var}]^{var} + (cyclic) = 0$ holds because by def. 69 $[v_1,-]^{var}$ acts as a derivation, and hence what remains to be shown is that

$$ \iota_{v_1} \iota_{\left([v_2, v_3]^{var}\right)} \Omega_{BFV} + (cyclic) = 0 \,. $$

We check this by repeated uses of def. 69, using in addition that

  1. $\delta \iota_{v_i} \Omega_{BFV} = 0$

    (since $\iota_{v_i} \Omega_{BFV} = \delta H_i$ by $v_i$ being Hamiltonian)

  2. $\mathcal{L}^{var}_{v_i} \Omega_{BFV} = 0$

    (since in addition $\delta \Omega_{BFV} = 0$)

  3. $\iota_{v_1} \iota_{v_2} \iota_{v_3} \Omega_{BFV} = 0$

    (since $\Omega_{BFV} \in \Omega^{p,2}_\Sigma(E)$ is of vertical degree 2, and since all variations $v_i$ are vertical by assumption).

So we compute as follows (a special case of FRS 13b, lemma 3.1.1):

$$
\begin{aligned}
  0 & = \delta \iota_{v_1} \iota_{v_2} \iota_{v_3} \Omega_{BFV}
  \\
  & = \mathcal{L}^{var}_{v_1} \iota_{v_2} \iota_{v_3} \Omega_{BFV} - \iota_{v_1} \delta \iota_{v_2} \iota_{v_3} \Omega_{BFV}
  \\
  & = \iota_{[v_1, v_2]^{var}} \iota_{v_3} \Omega_{BFV} + \iota_{v_2} \mathcal{L}^{var}_{v_1} \iota_{v_3} \Omega_{BFV} - \iota_{v_1} \mathcal{L}^{var}_{v_2} \iota_{v_3} \Omega_{BFV} + \iota_{v_1} \iota_{v_2} \delta \iota_{v_3} \Omega_{BFV}
  \\
  & = \iota_{[v_1, v_2]^{var}} \iota_{v_3} \Omega_{BFV} + \iota_{v_2} \iota_{[v_1,v_3]^{var}} \Omega_{BFV} - \iota_{v_1} \iota_{[v_2, v_3]^{var}} \Omega_{BFV}
  \\
  & = - \iota_{v_1} \iota_{[v_2, v_3]^{var}} \Omega_{BFV} - \iota_{v_2} \iota_{[v_3, v_1]^{var}} \Omega_{BFV} - \iota_{v_3} \iota_{[v_1, v_2]^{var}} \Omega_{BFV} \,.
\end{aligned}
$$

\,

The local Poisson bracket Lie algebra $(\Omega^{p,0}_{\Sigma,Ham}(E), [-,-]^{var})$ from prop. 36 is but the lowest stage of a higher Lie theoretic structure called the Poisson bracket Lie p-algebra. Here we will not go deeper into this higher structure (see at Higher Prequantum Geometry for more), but below we will need the following simple shadow of it:

Lemma

The horizontally exact Hamiltonian forms constitute a Lie ideal for the local Poisson Lie bracket (76).

Proof

Let $E$ be a horizontally exact Hamiltonian form, hence

$$ E = d K $$

for some $K$. Write $e$ for a Hamiltonian vector field for $E$.

Then for $(H,v)$ any other pair consisting of a Hamiltonian form and a corresponding Hamiltonian vector field, we have

$$
\begin{aligned}
  \iota_v \, \iota_e \, \Omega_{BFV}
  & = \phantom{-}\iota_v \, \delta E
  \\
  & = \phantom{-}\iota_v \, \delta \, d \, K
  \\
  & = - \iota_v \, d \, \delta K
  \\
  & = \phantom{-}d \, \iota_v \, \delta \, K \,.
\end{aligned}
$$

Here we used that the horizontal derivative anti-commutes with the vertical one by construction of the variational bicomplex, and that $\iota_v$ anti-commutes with the horizontal derivative $d$ since the variation $v$ (def. 63) is by definition vertical. Hence the Hamiltonian form $\iota_v \iota_e \Omega_{BFV}$ of the bracket is again horizontally exact.

Example

(local Poisson bracket for real scalar field)

Consider the Lagrangian field theory for the free real scalar field from example 39.

By example 45 its presymplectic current is

$$ \Omega_{BFV} = \eta^{\mu \nu} \delta \phi_{,\mu} \wedge \delta \phi \wedge \iota_{\partial_\nu} dvol_\Sigma \,. $$

The corresponding local Poisson bracket algebra (prop. 36) has in degree 0 Hamiltonian forms (def. 35) such as

$$ Q \;\coloneqq\; \phi \,\iota_{\partial_0} dvol_\Sigma \in \Omega^{p,0}(E) $$

and

$$ P \;\coloneqq\; \eta^{\mu \nu} \phi_{,\mu} \, \iota_{\partial_\nu} dvol_{\Sigma} \in \Omega^{p,0}(E) \,. $$

The corresponding Hamiltonian vector fields are

$$ v_Q = -\partial_{\phi_{,0}} $$

and

$$ v_P = - \partial_{\phi} \,. $$

Hence the corresponding local Poisson bracket is

$$ \{P,Q\} = \iota_{v_P} \iota_{v_Q} \Omega_{BFV} = \iota_{\partial_0} dvol_\Sigma \,. $$

More generally, for $b_1, b_2 \in C^\infty_{cp}(\Sigma)$ two bump functions we have

$$ \{ b_1 P, b_2 Q \} = b_1 b_2 \iota_{\partial_0} dvol_\Sigma \,. $$
Example

(local Poisson bracket for free Dirac field)

Consider the Lagrangian field theory of the free Dirac field on Minkowski spacetime (example 43), whose presymplectic current is, according to example 49, given by

$$ \Omega_{BFV} \;=\; (\overline{\delta \psi}) \, \gamma^\mu \, (\delta \psi) \, \iota_{\partial_\mu} dvol_\Sigma \,. \tag{77} $$

Consider this specifically in spacetime dimension $p + 1 = 4$ in which case the components $\psi_\alpha$ are complex number-valued (by prop./def. 10), so that the tuple $(\psi_\alpha)$ amounts to 8 real-valued coordinate functions. By changing complex coordinates, we may equivalently consider $(\psi_\alpha)$ as four coordinate functions, and $(\overline{\psi}^\alpha)$ as another four independent coordinate functions.

Using this coordinate transformation, it is immediate to find the following pairs of Hamiltonian vector fields and their Hamiltonian differential forms from def. 70 applied to (77)

| Hamiltonian vector field | Hamiltonian differential form |
|---|---|
| $\partial_{\psi_\alpha}$ | $\left(\overline{\psi}\gamma^\mu\right)^\alpha\, \iota_{\partial_\mu} dvol_\Sigma$ |
| $\partial_{\overline{\psi}_\alpha}$ | $\left( \gamma^\mu \psi \right)_\alpha \, \iota_{\partial_\mu} dvol_\Sigma$ |

and to obtain the following non-trivial local Poisson brackets (prop. 36) (the other possible brackets vanish):

$$ \left\{ \left( \gamma^\mu \psi \right)_\alpha \, \iota_{\partial_\mu} dvol_\Sigma \,,\, \left(\overline{\psi}\gamma^\mu\right)^\beta\, \iota_{\partial_\mu} dvol_\Sigma \right\} \;=\; \left(\gamma^\mu\right)_\alpha{}^{\beta} \, \iota_{\partial_\mu} dvol_\Sigma \,. $$

Notice the signs: Due to the odd-grading of the field coordinate function $\psi$, its variational derivative $\delta \psi$ has bi-degree $(1,odd)$ and the contraction operation $\iota_{\partial_\psi}$ has bi-degree $(-1,odd)$, so that commuting it past $\overline{\delta \psi}$ picks up two minus signs, a “cohomological” sign due to the differential form degrees, and a “supergeometric” one (def. 47):

$$ \iota_{\partial_\psi} \overline{\delta \psi} \cdots = (-1) (-1) \overline{\delta \psi} \,\iota_{\partial_\psi} \cdots \,. $$

For the same reason, the local Poisson bracket algebra is a super Lie algebra, whose super Lie bracket is symmetric on these odd elements:

$$ \left\{ \left( \gamma^\mu \psi \right)_\alpha \, \iota_{\partial_\mu} dvol_\Sigma \,,\, \left(\overline{\psi}\gamma^\mu\right)^\beta\, \iota_{\partial_\mu} dvol_\Sigma \right\} \;=\; + \left\{ \left(\overline{\psi}\gamma^\mu\right)^\beta\, \iota_{\partial_\mu} dvol_\Sigma \,,\, \left( \gamma^\mu \psi \right)_\alpha \, \iota_{\partial_\mu} dvol_\Sigma \right\} \,. $$

\,

This concludes our discussion of general infinitesimal symmetries of a Lagrangian. We pick this up again in the discussion of Gauge symmetries below. First, in the next chapter we discuss the concept of observables in field theory.

\,

Observables

Given a Lagrangian field theory (def. 39), a general observable quantity, or just observable for short (def. 71 below), is a smooth function

$$ A \;\colon\; \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \longrightarrow \mathbb{C} $$

on the on-shell space of field histories (example 16, example 33), hence a smooth “functional” of field histories. We think of this as assigning to each physically realizable field history $\Phi$ the value $A(\Phi)$ of the given quantity as exhibited by that field history. For instance concepts like “average field strength in the compact spacetime region $\mathcal{O}$” should be observables. In particular the field amplitude at spacetime point $x$ should be an observable, denoted $\mathbf{\Phi}^a(x)$.

In much of the literature on field theory, these point evaluation observables $\mathbf{\Phi}^a(x)$ (example 60 below) are eventually referred to as “fields” themselves, blurring the distinction between

  1. field species/field bundles $E$,

  2. field histories/sections $\Phi$,

  3. functions on the space of field histories $\mathbf{\Phi}^a(x)$.

In particular, the process of quantization (discussed in Quantization below) affects the third of these concepts only, in that it deforms the algebra structure on observables to a non-commutative algebra of quantum observables. For this reason the observables $\mathbf{\Phi}^a(x)$ are often referred to as quantum fields. But to understand the conceptual nature of quantum field theory it is important that the $\mathbf{\Phi}^a(x)$ are really the observables or quantum observables on the space of field histories.

fields

| aspect | term | type | description | def. |
|---|---|---|---|---|
| field component | $\phi^a$, $\phi^a_{,\mu}$ | $J^\infty_\Sigma(E) \to \mathbb{R}$ | coordinate function on jet bundle of field bundle | def. 34, def. 54 |
| field history | $\Phi$, $\frac{\partial \Phi}{\partial x^\mu}$ | $\Sigma \to J^\infty_\Sigma(E)$ | jet prolongation of section of field bundle | def. 34, def. 55 |
| field observable | $\mathbf{\Phi}^a(x)$, $\partial_{\mu} \mathbf{\Phi}^a(x)$ | $\Gamma_{\Sigma}(E) \to \mathbb{R}$ | derivatives of delta-functional on space of sections | def. 71, example 60 |
| averaging of field observable | $\alpha^\ast \mapsto \underset{\Sigma}{\int} \alpha^\ast_a(x) \mathbf{\Phi}^a(x) \, dvol_\Sigma(x)$ | $\Gamma_{\Sigma,cp}(E^\ast) \to Obs(E_{scp},\mathbf{L})$ | observable-valued distribution | def. 80 |
| algebra of quantum observables | $\left( Obs(E,\mathbf{L})_{\mu c},\, \star\right)$ | $\mathbb{C}Alg$ | non-commutative algebra structure on field observables | def. 127, def. 132 |

\,

There are various further conditions on observables which we will eventually consider, forming subspaces of gauge invariant observables (def. 98), local observables (def. 83 below), Hamiltonian local observables (def. 89 below) and microcausal observables (def. 126). While in the end it is only these special kinds of observables that matter, it is useful to first consider the unconstrained concept and then consecutively characterize smaller subspaces of well-behaved observables. In fact it is useful to consider yet more generally the observables on the full space of field histories (not just the on-shell subspace), called the off-shell observables.

In the case that the field bundle is a vector bundle (example 9), the off-shell space of field histories is canonically a vector space and hence it makes sense to consider linear off-shell observables, i.e. those observables $A$ with $A(c \Phi) = c A(\Phi)$ and $A(\Phi_1 + \Phi_2) = A(\Phi_1) + A(\Phi_2)$. It turns out that these are precisely the compactly supported distributions in the sense of Laurent Schwartz (prop. 37 below). This fact makes powerful tools from functional analysis and microlocal analysis available for the analysis of field theory (discussed below).
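To make the identification with distributions concrete, here is a minimal sketch, assuming that the field bundle is a trivial vector bundle with fiber coordinates $(\phi^a)$. The names $A_\alpha$ and $\delta(x,y)$ are introduced here only for illustration. Any compactly supported smooth section $\alpha^\ast$ of the dual bundle defines a linear off-shell observable by integration, and the point evaluation observables arise as the limiting case where the smearing function is replaced by a Dirac delta distribution:

```latex
\begin{aligned}
  A_\alpha(\Phi)
  & \coloneqq
  \int_\Sigma \alpha^\ast_a(x) \, \Phi^a(x) \, dvol_\Sigma(x)
  \\
  A_\alpha( c \Phi_1 + \Phi_2 )
  & =
  c \, A_\alpha(\Phi_1) + A_\alpha(\Phi_2)
  \\
  \mathbf{\Phi}^a(x)(\Phi)
  & =
  \int_\Sigma \delta(x,y) \, \Phi^a(y) \, dvol_\Sigma(y)
  \;=\;
  \Phi^a(x)
\end{aligned}
```

The second line is just linearity of the integral. The third line is the sense in which the table above describes field observables as delta-functionals on the space of sections.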

More generally there are the multilinear off-shell observables, and these are analogously given by distributions of several variables (def. 76 below). In fully perturbative quantum field theory one considers only the infinitesimal neighbourhood (example 27) of a single on-shell field history and in this case all observables are in fact given by such multilinear observables (def. 84 below).

For a free field theory (def. 62) whose Euler-Lagrange equations of motion are given by a linear differential operator which behaves well in that it is “Green hyperbolic” (def. 79 below) it follows that the actual on-shell linear observables are equivalently those off-shell observables which are spatially compactly supported distributional solutions to the formally adjoint equation of motion (prop. 43 below); and this equivalence is exhibited by composition with the causal Green function (def. 78 below).

This is theorem 1 below, which is pivotal for passing from classical field theory to quantum field theory:

$$
\left\{
  \,\,
  \array{ \text{polynomial} \\ \text{observables} }
  \,\,
\right\}
\underoverset{\simeq}{ \text{causal propagator} }{\longleftarrow}
\left\{
  \array{
    \text{spatially compactly supported}
    \\
    \text{distributions in several variables}
    \\
    \text{which are distributional solutions}
    \\
    \text{to the adjoint equations of motion}
  }
\right\}
$$

This fact makes, in addition, the distributional analysis of linear differential equations available for the analysis of free field theory, notably the theory of propagators, such as Feynman propagators (def. 108 below), which we turn to in Propagators below.

The functional analysis and microlocal analysis (below) of linear observables re-expressed in distribution theory via theorem 1 solves the issues that the original formulation of perturbative quantum field theory by Schwinger-Tomonaga-Feynman-Dyson in the 1940s was notorious for suffering from (Feynman 85): The normal ordered product of quantum observables in a Wick algebra of observables follows from Hörmander's criterion for the product of distributions to be well-defined (this we discuss in Free quantum fields below) and the renormalization freedom in the construction of the S-matrix is governed by the mechanism of extensions of distributions (this we discuss in Renormalization below).

Among the polynomial on-shell observables characterized this way, the focus is furthermore on the local observables:

In local field theory the idea is that both the equations of motion as well as the observations are fully determined by their restriction to infinitesimal neighbourhoods of spacetime points (events). For the equations of motion this means that they are partial differential equations as we have seen above. For the observables it should mean that they must be averages over regions of spacetime of functions of the value of the field histories and their derivatives at any point of spacetime. Now a “smooth function of the value of the field histories and their derivatives at any point” is precisely a smooth function on the jet bundle of the field bundle (example 54) pulled back via jet prolongation (def. 55). If this is to be averaged over spacetime it needs to be the coefficient of a horizontal $p+1$-form (prop. 59).

In mathematical terminology these desiderata say that the local observables in a local field theory should be precisely the “transgressions” (def. 82 below) of horizontal variational $p+1$-forms (with compact spacetime support, def. 81 below) to the space of field histories (example 16). This is def. 83 below.

A key example of a local observable in Lagrangian field theory (def. 60) is the action functional (example 66 below). This is the transgression of the Lagrangian density itself, or rather of its product with an “adiabatic switching function” that localizes its support in a compact spacetime region. In typical cases the physical quantity whose observation is represented by the action functional is the difference between the kinetic energy-momentum and the potential energy of a field history, averaged over the given region of spacetime.

The equations of motion of a Lagrangian field theory say that those field histories are physically realized which are critical points of this action functional observable. This is the principle of extremal action (prop. 45 below).
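The criticality condition can be spelled out in a simple special case. The following is a sketch under assumptions not made in the text above: the Lagrangian density $\mathbf{L} = L \, dvol_\Sigma$ depends only on $\phi^a$ and its first derivatives $\phi^a_{,\mu}$, the switching function is denoted $b$ (a symbol chosen here for illustration), and the variation $\Psi$ is compactly supported inside the region where $b = 1$. Writing $S_b$ for the resulting action functional and evaluating all jet coordinate functions on the jet prolongation of $\Phi$, integration by parts gives:

```latex
\begin{aligned}
  \frac{d}{d \epsilon} S_b( \Phi + \epsilon \Psi ) \Big\vert_{\epsilon = 0}
  & =
  \int_\Sigma
  \left(
    \frac{\partial L}{\partial \phi^a} \Psi^a
    +
    \frac{\partial L}{\partial \phi^a_{,\mu}} \frac{\partial \Psi^a}{\partial x^\mu}
  \right)
  dvol_\Sigma
  \\
  & =
  \int_\Sigma
  \left(
    \frac{\partial L}{\partial \phi^a}
    -
    \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi^a_{,\mu}}
  \right)
  \Psi^a \, dvol_\Sigma
  \\
  & =
  \int_\Sigma \frac{\delta_{EL} L}{\delta \phi^a} \, \Psi^a \, dvol_\Sigma
  \,.
\end{aligned}
```

The boundary term drops out because $\Psi$ has compact support. The last expression vanishes for all such $\Psi$ precisely if $\Phi$ solves the Euler-Lagrange equations on the region where $b = 1$.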

This formalizes what it means for a field history $\Phi$ to be “realizable” (physically admissible, i.e. a solution to the Euler-Lagrange equations, def. 61) and what the (local) observable quantities on field histories are (def. 83). It remains to formalize what it means for the physical system to be in some definite state so that the observable quantities take some definite value, reflecting the properties of that state.

Whatever formalization for states of a field theory one considers, at the very least the space of states $States$ should come with a linear pairing map

$$
\array{
  Obs \otimes States & \longrightarrow& \mathbb{C}
  \\
  \left( A , \langle - \rangle \right) &\mapsto& \langle A \rangle
}
$$

which reads in an observable quantity $A$ and a state, to be denoted $\langle - \rangle$, and produces the complex number $\langle A \rangle$ which is the “value of the observable quantity $A$ in the case that the physical system is in the state $\langle -\rangle$”.

One might imagine that it is fundamentally possible to pinpoint the exact field history that the physical system is found in. From this perspective, fixing a state should simply mean to pick such a field history, namely an element $\Phi \in \Gamma_{\Sigma}(E)_{\delta_{EL}\mathbf{L} = 0}$ in the on-shell space of field histories. If we write $\langle -\rangle_{\Phi}$ for this state, its pairing map with the observables would simply be evaluation of the observable, being a function on the field history space, on that particular element in this space:

$$ \langle A \rangle_{\Phi} \coloneqq A(\Phi) \,. $$

However, in the practice of experiment a field history can never be known precisely, without remaining uncertainty. Moreover, quantum physics (to which we finally come below) suggests that this is true not just in practice, but even in principle. Therefore we should allow states to be a kind of probability distribution on the space of field histories, and regard the pairing $\langle A \rangle$ of a state $\langle - \rangle$ with an observable $A$ as a kind of expectation value of the function $A$ averaged with respect to this probability distribution. Specifically, if the observable quantity $A$ is (a smooth approximation to) a characteristic function of a subset $S \subset \Gamma_{\Sigma}(E)_{\delta_{EL}\mathbf{L} = 0}$ of the space of field histories, then its value in a given state should be the probability to find the physical system in that subset of field histories.
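Heuristically, and assuming that a suitable probability measure $\mu$ on the on-shell space of field histories exists (which is not guaranteed in this infinite-dimensional setting; the symbols $\mu$, $\chi_S$ and $\delta_\Phi$ are introduced here only for illustration), these requirements read as follows:

```latex
\begin{aligned}
  \langle A \rangle_\mu
  & \coloneqq
  \int_{\Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0}} A(\Phi') \, d\mu(\Phi')
  \\
  \langle \chi_S \rangle_\mu
  & =
  \mu(S)
  \\
  \langle A \rangle_{\delta_\Phi}
  & =
  A(\Phi)
\end{aligned}
```

Here $\chi_S$ is the characteristic function of the subset $S$, so that the second line is the probability of finding the system in $S$. The third line shows that the sharply localized state $\langle - \rangle_\Phi$ from above is recovered as the special case where $\mu$ is the Dirac measure $\delta_\Phi$ concentrated on a single field history $\Phi$.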

But, moreover, the superposition principle of quantum physics says that the actually observable observables are only those of the form $A^\ast A$ (for $A^\ast$ the image of $A$ under the star-operation on the star algebra of observables).

This finally leads to the definition of states in def. 86 below.

\,

We now discuss these topics:

\,

\,

General observables

Definition

(observables)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) with $\Gamma_\Sigma(E)_{\delta_{EL} \mathbf{L} = 0}$ its on-shell space of field histories (def. 61).

Then the space of observables is the super formal smooth set (def. 48) which is the mapping space

$$ Obs(E,\mathbf{L}) \;\coloneqq\; \left[ \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \,,\, \mathbb{C} \right] $$

from the on-shell space of field histories to the complex numbers.

Similarly there is the space of off-shell observables

$$ Obs(E) \;\coloneqq\; \left[ \Gamma_\Sigma(E) \,,\, \mathbb{C} \right] \,. $$

Every off-shell observable induces an on-shell observable by restriction; this yields a smooth function

$$ Obs(E) \overset{(-)_{\delta_{EL}\mathbf{L} = 0}}{\longrightarrow} Obs(E,\mathbf{L}) \,. \tag{78} $$

Similarly we may consider the observables on the sub-spaces of field histories with restricted causal support according to def. 31. We write

$$ Obs(E_{scp}) \;\coloneqq\; \left[ \Gamma_{\Sigma,scp}(E), \mathbb{C} \right] $$

and

$$ Obs(E_{scp}, \mathbf{L}) \;\coloneqq\; \left[ \Gamma_{\Sigma,scp}(E)_{\delta_{EL} \mathbf{L} = 0}, \mathbb{C} \right] \tag{79} $$

for the spaces of (off-shell) observables on field histories with spatially compact support (def. 31).
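As a hedged summary of how these four spaces relate: assuming that the on-shell field histories with spatially compact support are exactly the on-shell field histories that also lie in $\Gamma_{\Sigma,scp}(E)$, restriction along the evident inclusions of spaces of field histories should give a commuting square of the following form (the unlabeled arrows are not named in the text and are restriction maps by assumption):

```latex
\array{
  Obs(E)
  & \overset{ (-)_{\delta_{EL}\mathbf{L} = 0} }{\longrightarrow} &
  Obs(E,\mathbf{L})
  \\
  \downarrow && \downarrow
  \\
  Obs(E_{scp})
  & \longrightarrow &
  Obs(E_{scp},\mathbf{L})
}
```

Under this assumption the square commutes because both composites restrict an off-shell observable to the same smallest subspace $\Gamma_{\Sigma,scp}(E)_{\delta_{EL}\mathbf{L} = 0}$.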

\,

Observables on bosonic fields

In the case that $E$ is a purely bosonic field bundle in smooth manifolds so that $\Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0}$ is a diffeological space (def. 16, def. 61) this means that a single observable $A \in Obs(E,\mathbf{L})$ is equivalently a smooth function (def. 35)

$$ A \;\colon\; \Gamma_\Sigma(E)_{\delta_{EL} \mathbf{L} = 0} \longrightarrow \mathbb{C} \,. $$

Explicitly, by def. 36 (and similarly by def. 48) this means that $A$ is for each Cartesian space $U$ (generally: super Cartesian space, def. 46) a natural function of plots

$$
A_U
\;\colon\;
\left\{
  \array{
    U \times \Sigma && \overset{\Phi_{(-)}}{\longrightarrow} && E
    \\
    & {}_{\mathllap{pr_2}}\searrow && \swarrow_{\mathrlap{fb}}
    \\
    && \Sigma
  }
\right\}_{\delta_{EL}\mathbf{L} = 0}
\;\overset{}{\longrightarrow}\;
\left\{ U \to \mathbb{C} \right\}
\,.
$$

Observables on fermionic fields

In the case that EE has purely fermionic fibers (def. 50), such as for the Dirac field (example 35) with E=Σ×S oddE = \Sigma\times S_{odd} then the only point in Obs E,LObs_{E,\mathbf{L}} is the zero-observable, instead an observable is now a morphism

(θθA): 0|1Obs E,L (\theta \mapsto \theta A) \;\colon\; \mathbb{R}^{0\vert 1} \longrightarrow Obs_{E,\mathbf{L}}

and its component AA is a bosonic observable as above.

The most basic kind of observables are the following:

Example

(point evaluation observables – field observables)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) whose field bundle (def. 34) over some spacetime $\Sigma$ happens to be a trivial vector bundle in even degree (i.e. bosonic) with field fiber coordinates $(\phi^a)$ (example 9). With respect to these coordinates a field history, hence a section of the field bundle

$$ \Phi \;\in \; \Gamma_\Sigma(E) $$

has components $(\Phi^a)$ which are smooth functions on spacetime.

Then for every index $a$ and every point $x \in \Sigma$ in spacetime (every event) there is an observable (def. 71) denoted $\mathbf{\Phi}^a(x)$ which is given by

$$ \mathbf{\Phi}^a(x) \;\colon\; \Phi_{(-)} \mapsto \Phi_{(-)}^a(x) \,, $$

hence which on a test space $U$ (a Cartesian space or more generally super Cartesian space, def. 46) sends a $U$-parameterized collection of fields

$$ \Phi_{(-)} \colon U \to \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} $$

to the $U$-parameterized collection of the values at $x$ of their $a$-th components.

Notice how the various aspects of the concept of “field” are involved here, all closely related but crucially different:

$$ \array{ \mathbf{\Phi}^a(x) &\colon& \Phi &\overset{\phantom{AA}}{\mapsto}& \Phi^a(x) &=& \phi^a & \circ \Phi(x) \\ \array{ \text{field} \\ \text{observable} } && \array{ \text{field} \\ \text{history} } && \array{ \text{field} \\ \text{value} } && \array{ \text{field} \\ \text{component} } } $$
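As a minimal worked illustration of how a point evaluation observable acts on a plot, consider the test space $U = \mathbb{R}$ and a one-parameter family of field histories. The sections $\Phi_0$ and $\Psi$ below are hypothetical choices made only for this sketch; for a free field theory one may take both to be solutions of the equations of motion, so that the whole family is on-shell.

```latex
% Hypothetical plot over U = \mathbb{R}, built from two fixed smooth sections \Phi_0, \Psi (assumed)
\[
  \Phi_{(-)} \;\colon\; u \;\mapsto\; \Phi_u \coloneqq \Phi_0 + u\, \Psi
\]
% The point evaluation observable sends this plot to a function on U:
\[
  \mathbf{\Phi}^a(x)\big(\Phi_{(-)}\big)
  \;\colon\;
  u \;\mapsto\; \Phi_u^a(x) \;=\; \Phi_0^a(x) + u\, \Psi^a(x)
\]
```

The result is an affine, hence smooth, function of the parameter $u$, which is what it means for the observable to send this plot of field histories to a plot of $\mathbb{C}$.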

\,

Polynomial off-shell Observables and Distributions

We consider here linear observables (def. 72 below) and more generally quadratic observables (def. 75) and generally polynomial observables (def. 76 below) for free field theories and discuss how these are equivalently given by integration against generalized functions called distributions (prop. 37 and prop. 38 below).

This is the basis for the discussion of quantum observables for free field theories further below.

\,

Definition

(linear off-shell observables)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) whose field bundle $E$ (def. 34) is a super vector bundle (as in example 9 and as opposed to more general non-linear fiber bundles).

This means that the off-shell space of field histories $\Gamma_\Sigma(E)$ (example 33) inherits the structure of a super vector space by spacetime-pointwise (i.e. event-wise) scaling and addition of field histories.

Then an off-shell observable (def. 71)

$$ A \;\colon\; \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \longrightarrow \mathbb{C} $$

is a linear observable if it is a linear function with respect to this vector space structure, hence if

$$ A( c \Phi_{(-)}) = c A(\Phi_{(-)}) \phantom{AAAA} \text{and} \phantom{AAAA} A(\Phi_{(-)} + \Phi'_{(-)}) = A(\Phi_{(-)}) + A(\Phi'_{(-)}) $$

for all scalars $c$ and all plots of field histories $\Phi_{(-)}, \Phi'_{(-)}$.

We write

$$ LinObs(E,\mathbf{L}) \hookrightarrow Obs(E,\mathbf{L}) $$

for the subspace of linear observables inside all observables (def. 71) and similarly

$$ LinObs(E) \hookrightarrow Obs(E) $$

for the linear off-shell observables inside all off-shell observables, and similarly for the subspaces of linear observables on field histories of spatially compact support (79):

$$ LinObs(E_{scp}, \mathbf{L}) \hookrightarrow Obs(E_{scp}, \mathbf{L}) \tag{80} $$

and

$$ LinObs(E_{scp}) \hookrightarrow Obs(E_{scp}) \,. $$

Example

(point evaluation observables are linear)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23), whose field bundle $E$ (def. 34) is the trivial vector bundle with field coordinates $(\phi^a)$ (example 9).

Then for each field component index $a$ and point $x \in \Sigma$ of spacetime (each event) the point evaluation observable (example 60)

$$ \array{ \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} &\overset{\mathbf{\Phi}^a(x)}{\longrightarrow}& \mathbb{C} \\ \Phi &\mapsto& \Phi^a(x) } $$

is a linear observable according to def. 72. The distribution that it corresponds to under prop. 37 is the Dirac delta-distribution at the point $x$ combined with the Kronecker delta on the index $a$. In the generalized function notation of remark 19 this reads:

$$ \mathbf{\Phi}^a(x) \;\colon\; \Phi \mapsto \int_\Sigma \Phi^b(y) \delta_b^a \delta(x,y) \, dvol_\Sigma(y) \,. $$

Proposition

(linear off-shell observables of scalar field are the compactly supported distributions)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23), whose field bundle $E$ (def. 34) is the trivial real line bundle (as for the real scalar field, example 10). This means that the off-shell space of field histories $\Gamma_\Sigma(E) \simeq C^\infty(\Sigma)$ (19) is the real vector space of smooth functions on Minkowski spacetime and that every linear observable $A$ (def. 72) gives a linear function

$$ A_\ast \;\colon\; C^\infty(\Sigma)_{\delta_{EL}\mathbf{L} = 0} \longrightarrow \mathbb{C} \,. $$

This linear function $A_\ast$ is in fact a compactly supported distribution, in the sense of functional analysis, in that it satisfies the following Fréchet vector space continuity condition:

  • Fréchet continuous linear functional

    A linear function $A_\ast \;\colon\; C^\infty(\mathbb{R}^{p,1}) \to \mathbb{R}$ is called continuous if there exist

    1. a compact subset $K \subset \mathbb{R}^{p,1}$ of Minkowski spacetime;

    2. a natural number $k \in \mathbb{N}$;

    3. a positive real number $C \in \mathbb{R}_+$

    such that for all on-shell field histories

    $$ \Phi \in C^\infty(\Sigma)_{\delta_{EL}\mathbf{L} = 0} $$

    the following inequality of absolute values $\vert - \vert$ of partial derivatives holds

    $$ {\vert A_\ast(\Phi)\vert} \;\leq\; C \underset{{\vert \alpha \vert} \leq k}{\sum} \, \underset{x \in K}{sup} {\vert \partial^\alpha \Phi(x)\vert} \,, $$

    where the sum is over all multi-indices $\alpha \in \mathbb{N}^{p+1}$ (1) whose total degree ${\vert \alpha\vert} \coloneqq \alpha_0 + \cdots + \alpha_{p}$ is bounded by $k$, and where

    $$ \partial^\alpha \Phi \;\coloneqq\; \frac{\partial^{{\vert \alpha\vert}} \Phi }{ \partial^{\alpha_0} x^0 \partial^{\alpha_1} x^1 \cdots \partial^{\alpha_p} x^p } $$

    denotes the corresponding partial derivative (1).

This identification constitutes a linear isomorphism

$$ \array{ LinObs(\Sigma \times \mathbb{R}) &\overset{\simeq}{\longrightarrow}& \mathcal{E}'(\Sigma) \\ \array{ \text{linear off-shell} \\ \text{observables} \\ \text{of the scalar field} } && \array{ \text{compactly supported} \\ \text{distributions} \\ \text{on spacetime} } } \,, $$

saying that all compactly supported distributions arise from linear off-shell observables of the scalar field this way, and uniquely so.

For proof see at distributions are the smooth linear functionals, this prop.
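As a quick check of the continuity condition in a concrete case (a sketch added for illustration; the choice of constants is ours), the point evaluation at a fixed event $x$ satisfies the estimate with $K = \{x\}$, $k = 0$ and $C = 1$, and the evaluation of a first partial derivative at $x$ satisfies it with $k = 1$:

```latex
% Point evaluation at a fixed event x: take K = \{x\}, k = 0, C = 1
\[
  \vert \Phi(x) \vert
  \;\leq\;
  1 \cdot \underset{y \in \{x\}}{\sup} \vert \Phi(y) \vert
\]
% Evaluation of a first partial derivative at x: take K = \{x\}, k = 1, C = 1
\[
  \vert \partial_\mu \Phi(x) \vert
  \;\leq\;
  \underset{\vert \alpha \vert \leq 1}{\sum}\,
  \underset{y \in \{x\}}{\sup} \vert \partial^\alpha \Phi(y) \vert
\]
```

By contrast, a functional such as $\Phi \mapsto \int_\Sigma \Phi \, dvol_\Sigma$ is not even defined on all of $C^\infty(\Sigma)$, since the integral need not converge; this reflects why the distributions appearing here have compact support.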

The identification from prop. 37 of linear off-shell observables with compactly supported distributions makes available powerful tools from functional analysis. The key fact is the following:

Proposition

(distributions are generalized functions)

For $n \in \mathbb{N}$, every compactly supported smooth function $b \in C^\infty_{cp}(\mathbb{R}^n)$ on the Cartesian space $\mathbb{R}^n$ induces a distribution (prop. 37), hence a continuous linear functional, by integration against $b$ times the volume form:

$$ \array{ C^\infty(\mathbb{R}^n) &\longrightarrow& \mathbb{R} \\ f &\mapsto& \int_{\mathbb{R}^n} f(x) b(x) \, dvol(x) } $$

The distributions arising this way are called the non-singular distributions.

This construction is clearly a linear inclusion

$$ C^\infty_{cp}(\mathbb{R}^n) \overset{\phantom{AAA}}{\hookrightarrow} \mathcal{E}'(\mathbb{R}^n) $$

and in fact this is a dense subspace inclusion for the space of compactly supported distributions $\mathcal{E}'(\mathbb{R}^n)$ equipped with the dual space topology (this def.) to the Fréchet space structure on $C^\infty(\mathbb{R}^n)$ from prop. 37.

Hence every compactly supported distribution $u$ is the limit of a sequence $\{b_k\}_{k \in \mathbb{N}}$ of compactly supported smooth functions, in that for every smooth function $f \in C^\infty(\mathbb{R}^n)$ the value $u(f) \in \mathbb{R}$ is the limit of the integrals against $b_k \, dvol$:

$$ u(f) \;=\; \underset{k \to \infty}{\lim}\, \int_{\mathbb{R}^n} f(x) b_k(x) \, dvol(x) \,. $$

(e. g. Hörmander 90, theorem 4.1.5)
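To make the density statement concrete, here is the standard approximation of the Dirac delta-distribution at a point $x_0 \in \mathbb{R}^n$ by non-singular distributions. The bump function $b$ and the rescaled sequence $b_k$ are choices made for this illustration and are not fixed by the text.

```latex
% Assumption: b is any smooth, compactly supported function with unit integral
\[
  b \in C^\infty_{cp}(\mathbb{R}^n),
  \qquad
  \int_{\mathbb{R}^n} b(z) \, dvol(z) = 1,
  \qquad
  b_k(x) \coloneqq k^n \, b\big( k (x - x_0) \big)
\]
% Substituting z = k (x - x_0), so that dvol(x) = k^{-n} dvol(z):
\[
  \int_{\mathbb{R}^n} f(x)\, b_k(x) \, dvol(x)
  \;=\;
  \int_{\mathbb{R}^n} f\big( x_0 + z/k \big)\, b(z) \, dvol(z)
  \;\overset{k \to \infty}{\longrightarrow}\;
  f(x_0)
  \;=\;
  \delta_{x_0}(f)
\]
```

The limit uses only that $f$ is continuous at $x_0$ and that $b$ has compact support, so that the integrand converges uniformly on the support of $b$.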

Proposition 38 with prop. 37 implies that with due care we may think of all linear off-shell observables as arising from integration of field histories against some “generalized smooth functions” (namely a limit of actual smooth functions):

Remark

(linear off-shell observables of real scalar field as integration against generalized functions)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23), whose field bundle $E$ (def. 34) is a trivial vector bundle with field coordinates $(\phi^a)$.

Prop. 37 implies immediately that in this situation linear off-shell observables $A$ (def. 72) correspond to tuples $(A_a)$ of compactly supported distributions via

$$ A(\Phi) = \underset{a}{\sum} A_a(\Phi^a) \,. $$

With prop. 38 it follows furthermore that there is a sequence of tuples of smooth functions $\{(\alpha_n)_{a}\}_{n \in \mathbb{N}}$ such that $A_a$ is the limit of the integrations against these:

$$ A(\Phi) \;=\; \underset{n \to \infty}{\lim} \, \int_\Sigma \Phi^a(x) (\alpha_n)_a(x) \, dvol(x) \,, $$

where now the sum over the index $a$ is again left notationally implicit.

For handling distributions/linear off-shell observables it is therefore useful to adopt, with due care, shorthand notation as if the limits of the sequences of smooth functions $(\alpha_n)_a$ actually existed, as “generalized functions” $\alpha_a$, and to set

$$ \int_\Sigma \Phi^a(x) \alpha_a(x) \, dvol(x) \;\coloneqq\; A(\Phi) \,. $$

This suggests that basic operations on functions, such as their pointwise product, should be extended to distributions, e.g. to a product of distributions. This turns out to exist, as long as the high-frequency modes in the Fourier transform of the distributions being multiplied cancel out – the mathematical reflection of “UV-divergences” in quantum field theory. This we turn to in Free quantum fields below.

These considerations generalize from the field bundle of the real scalar field to general field bundles (def. 34) as long as they are smooth vector bundles (def. 7):

Definition

(Fréchet topological vector space on spaces of smooth sections of a smooth vector bundle)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34) which is a smooth vector bundle (def. 7) over Minkowski spacetime (def. 23); hence, up to isomorphism, a trivial vector bundle as in example 9.

On its real vector space $\Gamma_\Sigma(E)$ of smooth sections consider the seminorms indexed by a compact subset $K \subset \Sigma$ and a natural number $k \in \mathbb{N}$ and given by

$$ \array{ \Gamma_\Sigma(E) &\overset{p_K^k}{\longrightarrow}& [0,\infty) \\ \Phi &\mapsto& \underset{ {\vert \alpha\vert} \leq k}{max} \left( \underset{x \in K}{sup} {\vert \partial^\alpha \Phi(x)\vert}\right) \,, } $$

where on the right we have the absolute values of the partial derivatives of $\Phi$ indexed by $\alpha$ (1) with respect to any choice of norm on the fibers.

This makes $\Gamma_\Sigma(E)$ a Fréchet topological vector space.

For $K \subset \Sigma$ any closed subset, the sub-space

$$ \Gamma_{\Sigma,K}(E) \hookrightarrow \Gamma_\Sigma(E) $$

of sections whose support is inside $K$ becomes a Fréchet topological vector space with the induced subspace topology, which makes it a closed subspace.

Finally, the vector spaces of smooth sections with prescribed causal support (def. 31) are inductive limits of vector spaces $\Gamma_{\Sigma,K}(E)$ as above, and hence they inherit topological vector space structure by forming the corresponding inductive limit in the category of topological vector spaces. For instance

$$ \Gamma_{\Sigma,cp}(E) \;\coloneqq\; \underset{\underset{ {K \subset \Sigma} \atop {K\, \text{compact}} }{\longrightarrow}}{\lim} \Gamma_{\Sigma,K}(E) $$

etc.

(Bär 14, 2.1)

Definition

(distributional sections)

Let $E \overset{fb}{\to} \Sigma$ be a smooth vector bundle (def. 7) over Minkowski spacetime (def. 23).

The vector spaces of smooth sections with restricted support from def. 31 inherit the structure of topological vector spaces via def. 73. We denote the dual topological vector spaces by

$$ \Gamma'_{\Sigma}( E^\ast) \;\coloneqq\; (\Gamma_{\Sigma,cp}(E))^\ast \,. $$

This is called the space of distributional sections of the bundle $E^\ast$.

The support $supp(u)$ of a distributional section $u$ is the set of points in $\Sigma$ such that for every neighbourhood of that point $u$ does not vanish on all sections with support in that neighbourhood.

Imposing the same restrictions to the supports of distributional sections as in def. 31, we have the following subspaces of distributional sections:

$$ \Gamma'_{\Sigma,cp}(E^\ast) , \Gamma'_{\Sigma,\pm cp}(E^\ast) , \Gamma'_{\Sigma,scp}(E^\ast) , \Gamma'_{\Sigma,fcp}(E^\ast) , \Gamma'_{\Sigma,pcp}(E^\ast) , \Gamma'_{\Sigma,tcp}(E^\ast) \;\subset\; \Gamma'_{\Sigma}(E^\ast) . $$

(Sanders 13, Bär 14)

As before in prop. 38 the actual smooth sections yield examples of distributional sections, and all distributional sections arise as limits of integrations against smooth sections:

Proposition

(non-singular distributional sections)

Let $E \overset{fb}{\to} \Sigma$ be a smooth vector bundle over Minkowski spacetime and let $s \in \{cp, \pm cp, scp, tcp\}$ be any of the support conditions from def. 31.

Then the operation of regarding a compactly supported smooth section of the dual vector bundle as a functional on sections with this support property is a dense subspace inclusion into the topological vector space of distributional sections from def. 74:

$$ \array{ \Gamma_{\Sigma,cp}(\tilde E) &\overset{\phantom{A}u_{(-)}\phantom{A} }{\hookrightarrow}& \Gamma'_{\Sigma,s}(E) \\ b &\mapsto& \left( \Phi \mapsto \underset{\Sigma}{\int} b(x) \cdot \Phi(x) \, dvol_\Sigma(x) \right) } $$

(Bär 14, lemma 2.15)

Proposition

(distribution dualities with causally restricted supports)

Let $E \overset{fb}{\to} \Sigma$ be a smooth vector bundle (def. 7) over Minkowski spacetime (def. 23).

Then there are the following isomorphisms of topological vector spaces between a) dual spaces of spaces of sections with restricted causal support (def. 31) and equipped with the topology from def. 73 and b) spaces of distributional sections with restricted supports, according to def. 74:

$$ \begin{aligned} \Gamma_{\Sigma,cp}(E)^* &\simeq \Gamma'_{\Sigma}(E^\ast) , \\ \Gamma_{\Sigma,+cp}(E)^* &\simeq \Gamma'_{\Sigma,fcp}(E^\ast) , \\ \Gamma_{\Sigma,-cp}(E)^* &\simeq \Gamma'_{\Sigma,pcp}(E^\ast) , \\ \Gamma_{\Sigma,scp}(E)^* &\simeq \Gamma'_{\Sigma,tcp}(E^\ast) , \\ \Gamma_{\Sigma,fcp}(E)^* &\simeq \Gamma'_{\Sigma,+cp}(E^\ast) , \\ \Gamma_{\Sigma,pcp}(E)^* &\simeq \Gamma'_{\Sigma,-cp}(E^\ast) , \\ \Gamma_{\Sigma,tcp}(E)^* &\simeq \Gamma'_{\Sigma,scp}(E^\ast) , \\ \Gamma_{\Sigma}(E)^* &\simeq \Gamma'_{\Sigma,cp}(E^\ast) . \end{aligned} $$

(Sanders 13, thm. 4.3, Bär 14, lem. 2.14)

The concept of linear observables naturally generalizes to that of multilinear observables:

Definition

(quadratic off-shell observables)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) over a spacetime $\Sigma$ whose field bundle $E$ (def. 34) is a super vector bundle.

The external tensor product of vector bundles of the field bundle $E \overset{fb}{\to} \Sigma$ with itself, denoted

$$ E \boxtimes E \to \Sigma \times \Sigma $$

is the vector bundle over the Cartesian product $\Sigma \times \Sigma$ of spacetime with itself, whose fiber over a pair of points $(x_1,x_2)$ is the tensor product $E_{x_1} \otimes E_{x_2}$ of the corresponding field fibers.

Given a field history, hence a section $\phi \in \Gamma_\Sigma(E)$ of the field bundle, there is then the induced section $\phi \boxtimes \phi \in \Gamma_{\Sigma \times \Sigma}(E \boxtimes E)$.

We say that an off-shell observable

$$ A \;\colon\; \Gamma_\Sigma(E) \longrightarrow \mathbb{C} $$

is quadratic if it comes from a “bilinear observable”, namely a smooth function on the space of sections of the external tensor product of the field bundle with itself

$$ B \;\colon\; \Gamma_{\Sigma \times \Sigma}(E \boxtimes E)_{\delta_{EL}\mathbf{L} = 0} \longrightarrow \mathbb{C} \,, $$

as

$$ A(\Phi) = B(\Phi,\Phi) \,. $$

More explicitly: By prop. 37 the quadratic observable $A$ is given by a compactly supported distribution of two variables which in the notation of remark 19 comes from a matrix of generalized functions $\beta_{a_1 a_2} \in \mathcal{E}'(\Sigma \times \Sigma, E \boxtimes E)$ as

$$ A(\Phi) \;=\; \int_{\Sigma \times \Sigma} \Phi^{a_1}(x_1) \beta_{a_1 a_2}(x_1,x_2) \Phi^{a_2}(x_2)\, dvol_\Sigma(x_1) dvol_\Sigma(x_2) \,. $$

This notation makes manifest how the concept of quadratic observables is a generalization of that of quadratic forms coming from bilinear forms.
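Two simple instances may help to fix the notation. Both are illustrative choices, stated for the real scalar field so that the indices $a_1, a_2$ can be dropped, and with $g$ a hypothetical compactly supported smooth function: the product of two point evaluations at fixed events $x, y$, and a local observable obtained by integrating the square of the field against $g$.

```latex
% Product of point evaluations at two fixed events x, y:
\[
  \beta(x_1, x_2) = \delta(x_1, x)\, \delta(x_2, y)
  \qquad \Rightarrow \qquad
  A(\Phi) = \Phi(x)\, \Phi(y)
\]
% Local quadratic observable, for g \in C^\infty_{cp}(\Sigma) (an assumed test function):
\[
  \beta(x_1, x_2) = g(x_1)\, \delta(x_1, x_2)
  \qquad \Rightarrow \qquad
  A(\Phi) = \int_\Sigma g(x)\, \Phi(x)^2 \, dvol_\Sigma(x)
\]
```

In both cases $\beta$ is a genuinely singular distribution on $\Sigma \times \Sigma$ with compact support, which is why the generalized function notation is needed already for these basic examples.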

Definition

(polynomial off-shell observables)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) over a spacetime $\Sigma$ whose field bundle $E$ (def. 34) is a super vector bundle.

An off-shell observable (def. 71)

$$ A \;\colon\; \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \longrightarrow \mathbb{C} $$

is polynomial if it is the sum of a constant, a linear observable (def. 72), a quadratic observable (def. 75), and so on:

$$ \begin{aligned} A(\Phi) & = \phantom{+} \alpha \\ & \phantom{=} + \int_{\Sigma} \Phi^a(x) \alpha_a(x) \, dvol_\Sigma(x) \\ & \phantom{=} + \int_{\Sigma^2} \Phi^{a_1}(x_1) \Phi^{a_2}(x_2) \alpha_{a_1 a_2}(x_1, x_2) \, dvol_\Sigma(x_1) dvol_\Sigma(x_2) \\ & \phantom{=} + \int_{\Sigma^3} \Phi^{a_1}(x_1) \Phi^{a_2}(x_2) \Phi^{a_3}(x_3) \alpha_{a_1 a_2 a_3}(x_1,x_2,x_3) \, dvol_\Sigma(x_1) dvol_\Sigma(x_2) dvol_\Sigma(x_3) \\ & \phantom{=} + \cdots \,. \end{aligned} $$

In summary, the above establishes that the Schwartz theory of (compactly supported) distributions neatly applies to characterize smooth polynomial observables on the diffeological space of field histories for a field bundle which is a vector bundle.

\,

Polynomial on-shell Observables and Distributional solutions to PDEs

While every off-shell observable induces an on-shell observable simply by restriction (78), different off-shell observables may restrict to the same on-shell observable. It is therefore useful to find a condition on off-shell observables that makes them equivalent to on-shell observables under restriction. Here we discuss, in the case of sufficiently well-behaved free field equations of motion (namely Green hyperbolic differential equations, def. 79 below), that this on-shell condition on the linear off-shell observables (def. 72) is that they are distributional solutions to the formal adjoint of the equations of motion, under their identification with distributions via prop. 37.

While in general the equations of motion are not Green hyperbolic (namely not in the presence of implicit infinitesimal gauge symmetries, discussed in Gauge symmetries below), it turns out that, up to a suitable notion of equivalence, they are equivalent to those that are. This we discuss in Gauge fixing below.

\,

Definition

(derivatives of distributions and distributional solutions of PDEs)

Given a pair of formally adjoint differential operators $P, P^\ast \colon \Gamma_\Sigma(E) \to \Gamma_\Sigma(E^\ast)$ (def. 58), the distributional derivative of a distributional section $u \in \Gamma'_\Sigma(E)$ (def. 74) by $P$ is the distributional section $P u \in \Gamma'_\Sigma(E^\ast)$

$$ P u \;\coloneqq\; u(P^\ast(-)) \;\colon\; \Gamma_{\Sigma,cp}(E^\ast) \,. $$

If

$$ P u = 0 \;\in\; \Gamma'_\Sigma(E^\ast) $$

then we say that $u$ is a distributional solution (or generalized solution) of the homogeneous differential equation defined by $P$.
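The textbook example of a distributional derivative may serve as a sanity check of this definition. The setting below is a toy choice made for illustration and not taken from the text: $\Sigma = \mathbb{R}$ with the trivial line bundle, $P = \frac{d}{dt}$ with formal adjoint $P^\ast = -\frac{d}{dt}$ (by integration by parts, with no boundary terms on compactly supported $b$), and $u = \Theta$ the Heaviside step function regarded as a distribution.

```latex
% Toy setting (our assumption): \Sigma = \mathbb{R}, P = d/dt, P^\ast = - d/dt, u = \Theta
\[
  (P \Theta)(b)
  \;=\;
  \Theta(P^\ast b)
  \;=\;
  - \int_0^\infty \frac{d b}{d t}(t) \, dt
  \;=\;
  b(0)
  \;=\;
  \delta_0(b)
\]
```

Hence $P \Theta = \delta_0$, so the step function is not a distributional solution of $P u = 0$, while every constant function $c$ is one, since $- c \int_{\mathbb{R}} \frac{d b}{d t}(t) \, dt = 0$ for compactly supported $b$.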

Example

(ordinary PDE solutions are generalized solutions)

Let $E \overset{fb}{\to} \Sigma$ be a smooth vector bundle over Minkowski spacetime and let $P, P^\ast \colon \Gamma_\Sigma(E) \to \Gamma_\Sigma(E^\ast)$ be a pair of formally adjoint differential operators.

Then for every non-singular distributional section $u_{\Phi} \in \Gamma'_{\Sigma}(E^\ast)$ coming from an actual smooth section $\Phi \in \Gamma_\Sigma(E)$ via prop. 39, the derivative of distributions (def. 77) is the distributional section induced from the ordinary derivative of smooth functions:

$$ P u_\Phi \;=\; u_{P \Phi} \,. $$

In particular $u_\Phi$ is a distributional solution to the PDE precisely if $\Phi$ is an ordinary solution:

$$ P u_\Phi \;=\; 0 \phantom{AAA} \Leftrightarrow \phantom{AAA} P \Phi = 0 \,. $$

Proof

For all $b \in \Gamma_{\Sigma,cp}(E)$ we have

$$ \begin{aligned} (P u_\Phi)(b) & = u_\Phi(P^\ast b) \\ & = \int \Phi \cdot P^\ast b \, dvol \\ & = \int (P \Phi) \cdot b \, dvol \\ & = u_{P \Phi}(b) \end{aligned} $$

where all steps are by the definitions except the third, which is by the definition of formally adjoint differential operator (def. 58), using that by the compact support of $b$ and the Stokes theorem (prop. 4) the term $K(\Phi,b)$ in def. 58 does not contribute to the integral.

Definition

(advanced and retarded Green functions and causal Green function)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34) which is a vector bundle (def. 7) over Minkowski spacetime (def. 23). Let $P \;\colon\;\Gamma_\Sigma(E) \to \Gamma_\Sigma(E^\ast)$ be a differential operator (def. 56) on its space of smooth sections.

Then a linear map

$$ \mathrm{G}_{P,\pm} \;\colon\; \Gamma_{\Sigma, cp}(E^\ast) \longrightarrow \Gamma_{\Sigma, \pm cp}(E) $$

from spaces of smooth sections of compact support to spaces of sections of causally sourced future/past support (def. 31) is called an advanced or retarded Green function for $P$, respectively, if

  1. for all $\Phi \in \Gamma_{\Sigma,cp}(E)$ we have

    $$ \mathrm{G}_{P,\pm} \circ P(\Phi) = \Phi \tag{81} $$

    and

    $$ P \circ \mathrm{G}_{P,\pm}(\Phi) = \Phi \tag{82} $$

  2. the support of $\mathrm{G}_{P,\pm}(\Phi)$ is in the closed future cone or closed past cone of the support of $\Phi$, respectively.

If the advanced/retarded Green functions $\mathrm{G}_{P,\pm}$ exist, then the difference

$$ \mathrm{G}_P \coloneqq \mathrm{G}_{P,+} - \mathrm{G}_{P,-} \tag{83} $$

is called the causal Green function.

(e.g. Bär 14, def. 3.2, cor. 3.10)
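For orientation, the definition can be checked by hand in the simplest possible case. The following sketch assumes the degenerate situation where spacetime is just the time axis $\Sigma = \mathbb{R}$ with the trivial line bundle, and takes $P = \frac{d^2}{dt^2} + \omega^2$ for a constant $\omega \gt 0$; this operator and its sign convention are choices made for illustration and are not taken from the text.

```latex
% Candidate Green functions with future/past supported image, for \alpha of compact support:
\[
  (\mathrm{G}_{P,+}\alpha)(t) = \int_{-\infty}^{t} \frac{\sin(\omega (t-s))}{\omega}\, \alpha(s) \, ds,
  \qquad
  (\mathrm{G}_{P,-}\alpha)(t) = - \int_{t}^{\infty} \frac{\sin(\omega (t-s))}{\omega}\, \alpha(s) \, ds
\]
% Differentiating twice in t (the boundary term of the first derivative vanishes since \sin(0) = 0):
\[
  \frac{d^2}{dt^2} (\mathrm{G}_{P,\pm}\alpha)(t)
  \;=\;
  \alpha(t) - \omega^2\, (\mathrm{G}_{P,\pm}\alpha)(t)
  \qquad \Rightarrow \qquad
  P \circ \mathrm{G}_{P,\pm}(\alpha) = \alpha
\]
% The difference is a solution of the homogeneous equation:
\[
  (\mathrm{G}_P \alpha)(t)
  \;=\;
  \int_{-\infty}^{\infty} \frac{\sin(\omega (t-s))}{\omega}\, \alpha(s) \, ds,
  \qquad
  P \circ \mathrm{G}_P(\alpha) = 0
\]
```

The support of $\mathrm{G}_{P,+}\alpha$ lies to the future of the earliest point of the support of $\alpha$, and that of $\mathrm{G}_{P,-}\alpha$ to the past of its latest point, as the second condition requires. The other composite, $\mathrm{G}_{P,\pm} \circ P = id$ on compactly supported sections, follows by integrating by parts twice.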

Definition

(Green hyperbolic differential equation)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34) which is a vector bundle (def. 7) over Minkowski spacetime (def. 23).

A differential operator (def. 57)

$$ P \;\colon\; \Gamma_\Sigma(E) \longrightarrow \Gamma_{\Sigma}(E^\ast) $$

is called a Green hyperbolic differential operator if $P$ as well as its formal adjoint differential operator $P^\ast$ (def. 58) admit advanced and retarded Green functions (def. 78).

(Bär 14, def. 3.2, Khavkine 14, def. 2.2)

The two archetypal examples of Green hyperbolic differential equations are the Klein-Gordon equation and the Dirac equation on Minkowski spacetime. For the moment we just cite the existence of the advanced and retarded Green functions for these; we will work them out in detail below in Propagators.

(e. g. Bär-Ginoux-Pfaeffle 07, Bär 14, example 3.3)

(Bär 14, corollary 3.15, example 3.16)

Example

(causal Green functions of formally adjoint Green hyperbolic differential operators are formally adjoint)

Let

$$ P, P^\ast \;\colon\;\Gamma_\Sigma(E) \longrightarrow \Gamma_\Sigma(E^\ast) $$

be a pair of Green hyperbolic differential operators (def. 79) which are formally adjoint (def. 58). Then also their causal Green functions $\mathrm{G}_P$ and $\mathrm{G}_{P^\ast}$ (def. 78) are formally adjoint to each other, up to a sign:

$$ \left( \mathrm{G}_P \right)^\ast \;=\; - \mathrm{G}_{P^\ast} \,. $$

(Khavkine 14, (24), (25))

We did not require that the advanced and retarded Green functions of a Green hyperbolic differential operator are unique; in fact this is automatic:

(Bär 14, cor. 3.12)

Moreover we did not require that the advanced and retarded Green functions of a Green hyperbolic differential operator come from integral kernels (“propagators”). This, too, is automatic:

Proposition

(causal Green functions of Green hyperbolic differential operators are continuous linear maps)

Given a Green hyperbolic differential operator $P$ (def. 79), the advanced, retarded and causal Green functions of $P$ (def. 78) are continuous linear maps with respect to the topological vector space structure from def. 73 and also have a unique continuous extension to the spaces of sections with larger support (def. 31) as follows:

$$ \begin{aligned} \mathrm{G}_{P,+} &\;\colon\; \Gamma_{\Sigma, pcp}(\tilde E^\ast) \longrightarrow \Gamma_{\Sigma, pcp}(E) , \\ \mathrm{G}_{P,-} &\;\colon\; \Gamma_{\Sigma, fcp}(\tilde E^\ast) \longrightarrow \Gamma_{\Sigma, fcp}(E) , \\ \mathrm{G}_{P} &\;\colon\; \Gamma_{\Sigma, tcp}(\tilde E^\ast) \longrightarrow \Gamma_{\Sigma}(E) , \end{aligned} $$

such that we still have the relation

$$ \mathrm{G}_P = \mathrm{G}_{P,+} - \mathrm{G}_{P,-} $$

and

$$ P \circ \mathrm{G}_{P,\pm} = \mathrm{G}_{P,\pm} \circ P = id $$

and

$$ supp \mathrm{G}_{P,\pm}(\tilde{\alpha}^*) \subseteq J^\pm(supp \tilde{\alpha}^*) \,. $$

By the Schwartz kernel theorem the continuity of $\mathrm{G}_{\pm}, \mathrm{G}$ implies that there are integral kernels

$$ \Delta_{\pm} \;\in\; \Gamma'_{\Sigma \times \Sigma}( E \boxtimes_\Sigma E ) $$

such that, in the notation of generalized functions,

$$ (\mathrm{G}_{\pm} \alpha^\ast)(x) \;=\; \underset{\Sigma}{\int} \Delta_\pm(x,y) \cdot \alpha^\ast(y) \, dvol_\Sigma(y) \,. $$

These integral kernels are called the advanced and retarded propagators. Similarly the combination

$$ \Delta \;\coloneqq\; \Delta_+ - \Delta_- \tag{84} $$

is called the causal propagator.

(Bär 14, thm. 3.8, cor. 3.11)

We now come to the main theorem on polynomial observables:

Lemma

(exact sequence of Green hyperbolic differential operator)

Let $\Gamma_\Sigma(E) \overset{P}{\longrightarrow} \Gamma_\Sigma(E^\ast)$ be a Green hyperbolic differential operator (def. 79) with causal Green function $\mathrm{G}_P$ (def. 78). Then the sequences

$$ \array{ 0 &\to& \Gamma_{\Sigma,cp}(E) &\overset{P}{\longrightarrow}& \Gamma_{\Sigma,cp}(E^\ast) &\overset{\mathrm{G}_P}{\longrightarrow}& \Gamma_{\Sigma,scp}(E) &\overset{P}{\longrightarrow}& \Gamma_{\Sigma,scp}(E^\ast) &\to& 0 \\ \\ 0 &\to& \Gamma_{\Sigma,tcp}(E) &\overset{P}{\longrightarrow}& \Gamma_{\Sigma,tcp}(E^\ast) &\overset{\mathrm{G}_P}{\longrightarrow}& \Gamma_{\Sigma}(E) &\overset{P}{\longrightarrow}& \Gamma_{\Sigma}(E^\ast) &\to& 0 } \tag{85} $$

of these operators restricted to functions with causally restricted supports as indicated (def. 31) are exact sequences of topological vector spaces and continuous linear maps between them.

Under passing to dual spaces and using the isomorphisms of spaces of distributional sections (def. 74) from prop. 40 this yields the following dual exact sequences of topological vector spaces and continuous linear maps between them:

$$ \array{ 0 &\to& \Gamma'_{\Sigma,tcp}(E) &\overset{P^*}{\longrightarrow}& \Gamma'_{\Sigma,tcp}(E^\ast) &\overset{-\mathrm{G}_{P^*}}{\longrightarrow}& \Gamma'_{\Sigma}(E) &\overset{P^*}{\longrightarrow}& \Gamma'_{\Sigma}(E^\ast) &\to& 0 \\ \\ 0 &\to& \Gamma'_{\Sigma,cp}(E) &\overset{P^*}{\longrightarrow}& \Gamma'_{\Sigma,cp}(E^\ast) &\overset{-\mathrm{G}_{P^*}}{\longrightarrow}& \Gamma'_{\Sigma,scp}(E) &\overset{P^*}{\longrightarrow}& \Gamma'_{\Sigma,scp}(E^\ast) &\to& 0 } \tag{86} $$

This is due to Igor Khavkine, based on (Khavkine 14, prop. 2.1); for proof see at Green hyperbolic differential operator this lemma.

Corollary

(on-shell space of field histories for Green hyperbolic free field theories)

Let (E,L)(E,\mathbf{L}) be a free field theory Lagrangian field theory (def. 43) whose Euler-Lagrange equation of motion PΦ=0P \Phi = 0 is Green hyperbolic (def. 79).

Then the on-shell space of field histories (or of field histories with spatially compact support, def. 31) is, as a vector space, linearly isomorphic to the quotient space of compactly supported sections (or of temporally compactly supported sections, def. 31) by the image of the differential operator PP, and this isomorphism is given by the causal Green function G P\mathrm{G}_P (83)

(87)Γ Σ,tcp(E *)/im(P) AG PA ker(P)=Γ Σ(E) δ ELL=0 Γ Σ,cp(E *)/im(P) AG PA ker scp(P)=Γ Σ,scp(E) δ ELL=0. \array{ \Gamma_{\Sigma,tcp}(E^\ast)/im(P) &\underoverset{\simeq}{\phantom{A}\mathrm{G}_P \phantom{A}}{\longrightarrow}& ker(P) \;=\; \Gamma_{\Sigma}(E)_{\delta_{EL}\mathbf{L} = 0} \\ \Gamma_{\Sigma,cp}(E^\ast)/im(P) &\underoverset{\simeq}{\phantom{A}\mathrm{G}_P\phantom{A}}{\longrightarrow}& ker_{scp}(P) \;=\; \Gamma_{\Sigma,scp}(E)_{\delta_{EL}\mathbf{L} = 0} \,. }
Proof

This is a direct consequence of the exactness of the sequence (85) in lemma 3.

We spell this out for the statement for $\Gamma_{\Sigma,scp}(E)_{\delta_{EL} \mathbf{L} = 0}$, which follows from the first line in (85); the first statement follows similarly from the second line of (85).

First, the on-shell space of field histories is the kernel of $P$, by definition of free field theory (def. 43):

$$ \Gamma_{\Sigma,scp}(E)_{\delta_{EL} \mathbf{L} = 0} \;=\; ker_{scp}(P) \,. $$

Second, exactness of the sequence (85) at $\Gamma_{\Sigma,scp}(E)$ means that the kernel $ker_{scp}(P)$ of $P$ equals the image $im(\mathrm{G}_{P})$. But by exactness of the sequence at $\Gamma_{\Sigma,cp}(E^\ast)$ it follows that $\mathrm{G}_P$ becomes injective on the quotient space $\Gamma_{\Sigma,cp}(E^\ast)/im(P)$. Therefore on this quotient space it becomes an isomorphism onto its image.
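As a sanity check on this argument, the following LaTeX sketch spells out why $\mathrm{G}_P$ takes values in $ker(P)$ and annihilates $im(P)$. It assumes, as is standard but not restated in this section, that the causal Green function is the difference $\mathrm{G}_P = \mathrm{G}_+ - \mathrm{G}_-$ of an advanced and a retarded Green function, each inverting $P$ on the relevant spaces of sections. The symbols $\mathrm{G}_\pm$, $\alpha^\ast$ and $\beta$ are names introduced here for illustration only.

```latex
% Assumption: G_P = G_+ - G_-, with P \circ G_\pm = id and G_\pm \circ P = id
% on compactly supported sections (sign and naming conventions may differ).
\begin{aligned}
  P\left( \mathrm{G}_P \alpha^\ast \right)
    & = P \mathrm{G}_+ \alpha^\ast - P \mathrm{G}_- \alpha^\ast
      = \alpha^\ast - \alpha^\ast
      = 0
    && \text{for } \alpha^\ast \in \Gamma_{\Sigma,cp}(E^\ast) \,,
  \\
  \mathrm{G}_P\left( P \beta \right)
    & = \mathrm{G}_+ P \beta - \mathrm{G}_- P \beta
      = \beta - \beta
      = 0
    && \text{for } \beta \in \Gamma_{\Sigma,cp}(E) \,.
\end{aligned}
```

The first line shows that $\mathrm{G}_P$ takes values in solutions; the second shows that it descends to the quotient by $im(P)$. Exactness of the sequence is what upgrades these two observations to the statement that the induced map is an isomorphism.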

Remark

Under passing to dual vector spaces, the linear isomorphism in corollary 1 in turn yields linear isomorphisms of the form

(88) $$ \array{ \left(\Gamma_{\Sigma,cp}(E^\ast)/im(P)\right)^\ast &\underoverset{\simeq}{(-)\circ \mathrm{G}_P}{\longleftarrow}& \left(ker_{scp}(P)\right)^\ast \\ \left(\Gamma_{\Sigma,tcp}(E^\ast)/im(P)\right)^\ast &\underoverset{\simeq}{(-)\circ \mathrm{G}_P }{\longleftarrow}& \left(ker(P)\right)^\ast } \,. $$

Except possibly for the issue of continuity, this says that the linear on-shell observables (def. 72) of a Green hyperbolic free field theory are equivalently those linear off-shell observables which are generalized solutions of the formally dual equation of motion according to def. 77.

That this remains true also for the topological vector space structure follows with the dual exact sequence (86). This is the statement of prop. 43 below.

Proposition

(distributional sections on a Green hyperbolic solution space are the generalized PDE solutions)

Let $P, P^\ast \;\colon\; \Gamma_\Sigma(E) \longrightarrow \Gamma_\Sigma(E^\ast)$ be a pair of Green hyperbolic differential operators (def. 79) which are formally adjoint (def. 58).

Then a continuous linear functional on the solution space

$$ u_{sol} \in \left(ker(P)\right)^\ast $$

is equivalently a distributional section (def. 74) whose support is spacelike compact (def. 31, prop. 40)

$$ u \in \Gamma'_{\Sigma,scp}(E^\ast) $$

and which is a distributional solution (def. 77) to the differential equation

$$ P^\ast u = 0 \,. $$

Similarly, a continuous linear functional on the subspace of solutions that have spatially compact support (def. 31)

$$ u_{sol} \in \left(ker_{scp}(P)\right)^\ast $$

is equivalently a distributional section (def. 74) without constraint on its distributional support

$$ u \in \Gamma'_{\Sigma}(E^\ast) $$

and which is a distributional solution (def. 77) to the differential equation

$$ P^\ast u = 0 \,. $$

Moreover, these linear isomorphisms are both given by composition with the causal Green function $\mathrm{G}$ (def. 78):

$$ \array{ \left(ker(P)\right)^\ast &\underoverset{\simeq}{(-)\circ \mathrm{G}}{\longrightarrow}& \left\{ u \in \Gamma'_{\Sigma,scp}(E^\ast) \,\vert\, P^\ast u = 0 \right\} \\ \left(ker_{scp}(P)\right)^\ast &\underoverset{\simeq}{(-)\circ \mathrm{G}}{\longrightarrow}& \left\{ u \in \Gamma'_{\Sigma}(E^\ast) \,\vert\, P^\ast u = 0 \right\} } \,. $$

This follows from the exact sequence in lemma 3. For details of the proof see at Green hyperbolic differential operator this prop., due to Igor Khavkine.

In conclusion we have found the following:

Theorem

(linear observables of Green hyperbolic free field theory are the distributional solutions to the formally adjoint equations of motion)

Let $(E,\mathbf{L})$ be a free Lagrangian field theory (def. 62) whose Euler-Lagrange differential equation of motion $P \Phi = 0$ (def. 61) is Green hyperbolic (def. 79), such as the Klein-Gordon equation (example 63) or the Dirac equation (example 64). Then:

  1. The linear off-shell observables (def. 72) are equivalently the compactly supported distributional sections (def. 74) of the dual vector bundle (def. 8) of the field bundle:

    $$ LinObs(E) \;\simeq\; \Gamma'_{\Sigma,cp}(E^\ast) $$
  2. The linear on-shell observables (def. 72) are equivalently those spacelike compactly supported distributional sections (def. 74) which are distributional solutions of the formally adjoint equations of motion (def. 58), and this isomorphism is exhibited by precomposition with the causal propagator $\mathrm{G}$:

    $$ LinObs(E,\mathbf{L}) \;\underoverset{\simeq}{\phantom{A} \mathrm{G}_P \phantom{A}}{\longleftarrow}\; \left\{ A \in \Gamma'_{\Sigma,scp}(E^\ast) \;\vert\; P^\ast A = 0 \right\} $$

    Similarly, the linear on-shell observables on spacelike compactly supported on-shell field histories (79) are equivalently the distributional solutions without constraint on their support:

    $$ LinObs(E_{scp},\mathbf{L}) \;\underoverset{\simeq}{\phantom{A} \mathrm{G}_P \phantom{A}}{\longleftarrow}\; \left\{ A \in \Gamma'_{\Sigma}(E^\ast) \;\vert\; P^\ast A = 0 \right\} $$
Proof

The first statement follows with prop. 37 applied componentwise. The same proof applies verbatim to the subspace of solutions, showing that $LinObs(E,\mathbf{L}) \simeq \left( ker(P)\right)^\ast$, with the dual topological vector space on the right. With this the second statement follows by prop. 43.

We will be interested in those linear observables which under the identification from theorem 1 correspond to the non-singular distributions (because on these the Poisson-Peierls bracket of the theory is defined, theorem 2 below):

Definition

(regular linear field observables and observable-valued distributions)

Let $(E,\mathbf{L})$ be a free Lagrangian field theory (def. 62) whose Euler-Lagrange equations of motion (prop. 81) are Green hyperbolic (def. 79).

Define the regular linear field observables among the linear on-shell observables (def. 72) to be the non-singular distributions on the on-shell space of field histories, hence the image

$$ LinObs(E_{scp},\mathbf{L})^{reg} \hookrightarrow LinObs(E_{scp},\mathbf{L}) $$

of the map

(89) $$ \array{ \mathbf{\Phi} &\colon& \Gamma_{\Sigma,cp}(E^\ast) &\longrightarrow& LinObs(E_{scp},\mathbf{L}) &\hookrightarrow& Obs(E_{scp},\mathbf{L}) \\ && \alpha^\ast &\mapsto& \left( \Phi \mapsto \underset{\Sigma}{\int} \alpha^\ast_a(x) \cdot \Phi^a(x) \, dvol_\Sigma(x) \right) } $$

By lemma 3 every on-shell field history $\Phi \in ker_{scp}(P) \subset \Gamma_{\Sigma,scp}(E)$ is in the image of $\mathrm{G}$, and by example 65 this implies that the kernel of this map is the image of $P \;\colon\; \Gamma_{\Sigma,cp}(E) \to \Gamma_{\Sigma,cp}(E^\ast)$:

(90) $$ LinObs(E_{scp},\mathbf{L})^{reg} \;\simeq\; \Gamma_{\Sigma,cp}(E^\ast)/im(P) \,. $$

The point-evaluation field observables $\mathbf{\Phi}^a(x)$ (example 60) are linear observables (example 61) but far from being regular (89) (except in spacetime dimension $p + 1 = 0 + 1$). But the regular observables are precisely the averages (“smearings”) of these point-evaluation observables against compactly supported weights.

Viewed this way, the defining inclusion of the regular linear observables (89) is itself an observable-valued distribution

(91) $$ \array{ \mathbf{\Phi} &\colon& \Gamma_{\Sigma,cp}(E^\ast) &\hookrightarrow& LinObs(E,\mathbf{L}) \\ && \alpha^\ast &\mapsto& \underset{\Sigma}{\int} \alpha^\ast_a(x) \mathbf{\Phi}^a(x)\, dvol_\Sigma(x) } $$

which to a “smearing function” $\alpha^\ast$ assigns the observable which is the field observable smeared by (i.e. averaged against) that smearing function.
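For orientation, here is a LaTeX sketch of the smearing map in the simplest case of a single real scalar field, where $E^\ast \simeq E$ is the trivial real line bundle (an assumption made only for this illustration; $x_0$ and $\delta_{x_0}$ are illustrative names). It contrasts a regular observable with the point-evaluation observable, which formally corresponds to smearing against a Dirac delta distribution rather than against a smooth function.

```latex
% Regular linear observable: smearing against a smooth, compactly supported \alpha
\mathbf{\Phi}(\alpha) \;\colon\; \Phi \;\mapsto\;
  \int_\Sigma \alpha(x) \, \Phi(x) \, dvol_\Sigma(x) \,,
\qquad \alpha \in C^\infty_{cp}(\Sigma) \,.

% Point evaluation at x_0: formally the same expression with \alpha replaced by \delta_{x_0}
\mathbf{\Phi}(x_0) \;\colon\; \Phi \;\mapsto\; \Phi(x_0)
  \;=\; \int_\Sigma \delta_{x_0}(x) \, \Phi(x) \, dvol_\Sigma(x) \,.
```

Since $\delta_{x_0}$ is compactly supported but not a smooth function, the point evaluation is a linear observable that is not, as it stands, of the regular form in the first line.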

Below in Free quantum fields we discuss how the polynomial Poisson algebra of regular polynomial observables of a free field theory may be deformed to a non-commutative algebra of quantum observables. Often this may be represented by linear operators acting on some Hilbert space. In this case $\mathbf{\Phi}$ above becomes a continuous linear map from $\Gamma_{\Sigma,cp}(E^\ast)$ to a space of linear operators on some Hilbert space. As such it is then called an operator-valued distribution.

\,

Local observables

We now discuss the sub-class of those observables which are “local”.

Definition

(spacetime support)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle over a spacetime $\Sigma$ (def. 34), with induced jet bundle $J^\infty_\Sigma(E)$.

For every subset $S \subset \Sigma$ let

$$ \array{ J^\infty_\Sigma(E)\vert_S &\overset{\iota_S}{\hookrightarrow}& J^\infty_\Sigma(E) \\ \downarrow &(pb)& \downarrow \\ S &\hookrightarrow& \Sigma } $$

be the corresponding restriction of the jet bundle of $E$.

The spacetime support $supp_\Sigma(A)$ of a differential form $A \in \Omega^\bullet(J^\infty_\Sigma(E))$ on the jet bundle of $E$ is the topological closure of the set of points $x \in \Sigma$ such that the restriction of $A$ to the jet bundle over $x$ does not vanish:

$$ supp_\Sigma(A) \coloneqq Cl( \{ x \in \Sigma \,\vert\, \iota_{\{x\}}^\ast A \neq 0 \} ) $$

We write

$$ \Omega^{r,s}_{\Sigma,cp}(E) \coloneqq \left\{ A \in \Omega^{r,s}_\Sigma(E) \;\vert\; supp_\Sigma(A) \, \text{is compact} \right\} \;\hookrightarrow\; \Omega^{r,s}_\Sigma(E) $$

for the subspace of differential forms on the jet bundle whose spacetime support is a compact subspace.

Definition

(transgression of variational differential forms to space of field histories)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle over a spacetime $\Sigma$ (def. 34), and let

$$ \Sigma_r \hookrightarrow \Sigma $$

be a submanifold of spacetime of dimension $r \in \mathbb{N}$. Recall the space of field histories restricted to its infinitesimal neighbourhood, denoted $\Gamma_{\Sigma_r}(E)$ (def. 34).

Then the operation of transgression of variational differential forms to $\Sigma_r$ is the linear map

$$ \tau_{\Sigma_r} \;\colon\; \Omega^{\bullet,\bullet}_{\Sigma,cp}(E) \longrightarrow \Omega^\bullet\left( \Gamma_{\Sigma_r}(E) \right) $$

that sends a variational differential form $A \in \Omega^{\bullet,\bullet}_{\Sigma,cp}(E)$ to the differential form $\tau_{\Sigma_r} A \in \Omega^\bullet(\Gamma_{\Sigma_r}(E))$ (def. 37, example 32) which to a smooth family of field histories

$$ \Phi_{(-)}(-) \;\colon\; U \times N_\Sigma \Sigma_r \longrightarrow E $$

assigns the differential form given by first forming the pullback of differential forms along the family of jet prolongations $j^\infty_\Sigma(\Phi_{(-)})$, followed by the integration of differential forms over $\Sigma_r$:

$$ (\tau_{\Sigma_r}A)_{\Phi_{(-)}} \;\coloneqq\; \int_{\Sigma_r} (j^\infty_\Sigma(\Phi_{(-)}))^\ast A \;\in\; \Omega^\bullet(U) \,. $$
Remark

(transgression to dimension $r$ picks out horizontal $r$-forms)

In def. 82 we regard integration of differential forms over $\Sigma_r$ as an operation defined on differential forms of all degrees, which vanishes except on forms of degree $r$, and hence transgression of variational differential forms to $\Sigma_r$ vanishes except on the subspace

$$ \Omega^{r,\bullet}_\Sigma(E) \;\subset\; \Omega^{\bullet,\bullet}_\Sigma(E) $$

of forms of horizontal degree $r$.

Example

(adiabatically switched action functional)

Given a field bundle $E \overset{fb}{\longrightarrow} \Sigma$, consider a local Lagrangian density (def. 60)

$$ \mathbf{L} \in \Omega^{p+1,0}_\Sigma(E) \,. $$

For any bump function $b \in C^\infty_{cp}(\Sigma)$, the transgression of $b \mathbf{L}$ (def. 82) is called the action functional

$$ \mathcal{S}_{b \mathbf{L}} \coloneqq \tau_{\Sigma} \left( b \mathbf{L} \right) \;\colon\; \Gamma_\Sigma(E) \longrightarrow \mathbb{R} $$

induced by $\mathbf{L}$, “adiabatically switched” by $b$.

Specifically, if the field bundle is a trivial vector bundle as in example 9, such that the adiabatically switched Lagrangian density may be written in the form

$$ b \mathbf{L} \;=\; L \left( (x^\mu), (\phi^a), (\phi^a_{,\mu}), \cdots \right) \, b \, dvol_\Sigma \;\in\; \Omega^{p+1,0}_{\Sigma,cp}( E ) \,, $$

then its action functional takes a field history $\Phi$ to the value

$$ \mathcal{S}_{b \mathbf{L}}(\Phi) \;=\; \int_\Sigma L \left( x, \left( \Phi^a(x) \right), \left(\frac{\partial \Phi^a}{\partial x^\mu}(x)\right), \cdots \right) \, b(x) \, dvol_\Sigma(x) $$
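To make this formula concrete, the following LaTeX sketch writes out the adiabatically switched action functional for a single free real scalar field of mass $m$ on Minkowski spacetime. The metric signature $(+,-,\cdots,-)$ and the overall normalization of the Lagrangian are assumptions made for this illustration only and may differ from the conventions fixed elsewhere in these notes.

```latex
% Assumption: one field coordinate \phi, metric \eta of signature (+,-,...,-)
L \;=\; \tfrac{1}{2} \eta^{\mu \nu} \phi_{,\mu} \phi_{,\nu} \;-\; \tfrac{1}{2} m^2 \phi^2

% Evaluating on a field history \Phi replaces jet coordinates by actual derivatives:
\mathcal{S}_{b \mathbf{L}}(\Phi)
  \;=\;
  \int_\Sigma
    \left(
      \tfrac{1}{2} \eta^{\mu \nu}
        \frac{\partial \Phi}{\partial x^\mu}(x) \frac{\partial \Phi}{\partial x^\nu}(x)
      \;-\;
      \tfrac{1}{2} m^2 \Phi(x)^2
    \right)
    b(x) \, dvol_\Sigma(x)
```

The compact support of $b$ is what makes the integral converge for arbitrary smooth $\Phi$, without any decay condition on the field history itself.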
Proposition

(transgression compatible with variational derivative)

Let $E \overset{fb}{\to} \Sigma$ be a field bundle over a spacetime $\Sigma$ (def. 34) and let $\Sigma_r \hookrightarrow \Sigma$ be a submanifold, possibly with boundary $\partial \Sigma_r \hookrightarrow \Sigma_r$. Write

$$ \Gamma_{\Sigma_r}(E) \overset{(-)\vert_{\partial \Sigma_r}}{\longrightarrow} \Gamma_{\partial \Sigma_r}(E) $$

for the boundary restriction map.

Then the operation of transgression of variational differential forms (def. 82)

$$ \tau_{\Sigma_r} \;\colon\; \Omega^{\bullet,\bullet}_{\Sigma,cp}(E) \longrightarrow \Omega^\bullet\left(\Gamma_{\Sigma_r}(E)\right) $$

is compatible with the variational derivative $\delta$ and with the total spacetime derivative $d$ in the following way:

  1. On variational forms that are in the image of the total spacetime derivative a transgressive variant of Stokes' theorem (prop. 4) holds:

    $$ \tau_{\Sigma_r}(d \alpha) \;=\; ((-)\vert_{\partial \Sigma_r})^\ast \tau_{\partial \Sigma_r}( \alpha) $$
  2. Transgression intertwines, up to a sign, the variational derivative $\delta$ on variational differential forms with the plain de Rham differential on the space of field histories:

    $$ \tau_{\Sigma}\left( \delta \alpha \right) \;=\; (-1)^{p+1}\, d \,\tau_{\Sigma}(\alpha) \,. $$
Proof

Regarding the first statement, consider a horizontally exact variational form

$$ d \alpha \in \Omega^{r,s}_{\Sigma,cp}(E) \,. $$

By prop. 20 the pullback of this form along the jet prolongation of fields is exact in the $\Sigma$-direction:

$$ (j^\infty_\Sigma\Phi_{(-)})^\ast(d \alpha ) \;=\; d_\Sigma (j^\infty_\Sigma\Phi_{(-)})^\ast \alpha \,, $$

(where we write $d = d_U + d_\Sigma$ for the de Rham differential on $U \times \Sigma$). Hence by the ordinary Stokes' theorem (prop. 4), restricted to any $\Phi_{(-)} \colon U \to \Gamma_{\Sigma_r}(E)$ with boundary restriction $(-)\vert_{\partial \Sigma_r} \circ \Phi_{(-)} \colon U \to \Gamma_{\partial \Sigma_r}(E)$, we have the relation

$$ \begin{aligned} (\Phi_{(-)})^\ast \tau_{\Sigma_r}(d \alpha) & = \int_{\Sigma_r} d_{\Sigma_r} (j^\infty_\Sigma\Phi_{(-)})^\ast\alpha \\ & = \int_{\partial \Sigma_r} (j^\infty_\Sigma\Phi_{(-)})^\ast\alpha \\ & = \int_{\partial \Sigma_r} (j^\infty_\Sigma ( (-)\vert_{\partial \Sigma_r} \circ \Phi_{(-)}) )^\ast\alpha \\ & = ( (-)\vert_{\partial \Sigma_r} \circ \Phi_{(-)} )^\ast \tau_{\partial \Sigma_r}(\alpha) \\ & = (\Phi_{(-)})^\ast ((-)\vert_{\partial \Sigma_r})^\ast \tau_{\partial \Sigma_r}(\alpha) \,. \end{aligned} $$

Regarding the second statement: by the Leibniz rule for the de Rham differential (product law of differentiation) it is sufficient to check the claim on variational derivatives of local coordinate functions

$$ \delta \phi^a_{\mu_1 \cdots \mu_k} b \in \Omega^{0,1}_\Sigma(E) \,. $$

The pullback of differential forms (prop. 2) along the jet prolongation $j^\infty_\Sigma(\Phi_{(-)}) \colon U \times \Sigma \to J^\infty_\Sigma(E)$ has two contributions: one from the variation along $\Sigma$, the other from variation along $U$:

  1. By prop. 20, for fixed $u \in U$ the pullback of $\delta \phi^a_{\mu_1 \cdots \mu_k}$ along the jet prolongation vanishes.

  2. For fixed $x \in \Sigma$, the pullback of the full de Rham differential $\mathbf{d} \phi^a_{\mu_1\cdots \mu_k}$ is

    $$ \begin{aligned} (\Phi_{(-)}(x))^\ast( \mathbf{d} \phi^a_{\mu_1\cdots \mu_k} ) & = d_U (\Phi_{(-)}(x))^\ast(\phi^a_{\mu_1\cdots \mu_k}) \\ & = d_U \frac{ \partial^k \Phi^a_{(-)}(x)}{\partial x^{\mu_1} \cdots \partial x^{\mu_k}} \end{aligned} $$

    (since the full de Rham differential always commutes with pullback of differential forms by prop. 2), while the pullback of the horizontal derivative $d \phi^a_{\mu_1\cdots \mu_k} = \phi^a_{\mu_1 \cdots \mu_{k} \mu_{k+1}} \mathbf{d}x^{\mu_{k+1}}$ vanishes at fixed $x \in \Sigma$.

This implies over the given smooth family $\Phi_{(-)}$ that

$$ \begin{aligned} \tau_\Sigma\left( \delta \phi^a_{\mu_1 \cdots \mu_k} b \right)\vert_{\Phi_{(-)}} & = \tau_\Sigma\left( \mathbf{d} ( \phi^a_{\mu_1 \cdots \mu_k} b) \right) \vert_{\Phi_{(-)}} - \underset{ = 0 }{ \underbrace{ \tau_\Sigma \left( d (\phi^a_{\mu_1 \cdots \mu_k} b) \right)\vert_{\Phi_{(-)}} }} \\ & = \int_\Sigma d_U (\Phi_{(-)})^\ast ( \phi^a_{\mu_1\cdots \mu_k} b ) \\ & = (-1)^{p+1} d_U \int_\Sigma (\Phi_{(-)})^\ast ( \phi^a_{\mu_1\cdots \mu_k} b ) \\ & = (-1)^{p+1} d_U \, \tau_{\Sigma}( \phi^a_{\mu_1 \cdots \mu_k} b )\vert_{\Phi_{(-)}} \,. \end{aligned} $$

Since this holds covariantly for all smooth families $\Phi_{(-)}$, this implies the claim.

Example

(variation of the action functional)

Given a Lagrangian field theory $(E,\mathbf{L})$ (def. 60), the derivative of its adiabatically switched action functional (def. 66) equals the transgression of the Euler-Lagrange variational derivative $\delta_{EL} \mathbf{L}$ (def. 22):

$$ d \mathcal{S}_{b \mathbf{L}} \;=\; \tau_\Sigma( b \delta_{EL}\mathbf{L} ) \,. $$
Proof

By the second statement of prop. 44 we have

$$ d \mathcal{S}_{b \mathbf{L}} \;=\; \tau_\Sigma( \delta ( b \mathbf{L} ) ) \,. $$

Moreover, by prop. 22 this is

$$ \begin{aligned} \cdots & = \tau_\Sigma( \delta_{EL} b \mathbf{L} + d \Theta_{BFV,b} ) \\ & = \tau_\Sigma( \delta_{EL} b \mathbf{L} ) + \underset{= 0}{\underbrace{\tau_\Sigma( d \Theta_{BFV,b} )}} \end{aligned} \,, $$

where the second term vanishes by the first statement of prop. 44.

Proposition

(principle of extremal action)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60).

The de Rham differential $d \mathcal{S}_{b\mathbf{L}}$ of the action functional (example 67) vanishes at a field history

$$ \Phi \in \Gamma_\Sigma(E) $$

for all adiabatic switchings $b \in C^\infty_{cp}(\Sigma)$ constant on some subset $\mathcal{O} \subset \Sigma$ (def. 33), on those smooth collections of field histories

$$ \Phi_{(-)} \;\colon\; U \longrightarrow \Gamma_\Sigma(E) $$

around $\Phi$ which, as functions on $U$, are constant outside $\mathcal{O}$ (example 16, example 33), precisely if $\Phi$ solves the Euler-Lagrange equations of motion (def. 61):

$$ \left( \underset{ { {\mathcal{O} \subset \Sigma} \atop { b\vert_{\mathcal{O}} = const } } \atop { \Phi_{(-)}\vert_{\Sigma \setminus \mathcal{O}} = const } }{\forall} \left( (\Phi_{(-)})^\ast d \mathcal{S}_{b \mathbf{L}}(\Phi) = 0 \right) \right) \;\Leftrightarrow\; \left( j^\infty_\Sigma(\Phi)^\ast \left( \frac{\delta_{EL} L}{\delta \phi^a} \right) = 0 \right) \,. $$
Proof

By prop. 44 we have

$$ (\Phi_{(-)})^\ast d \mathcal{S}_{b \mathbf{L}} \;=\; \int_\Sigma j^\infty_\Sigma(\Phi_{(-)})^\ast ( \delta_{EL} b \mathbf{L} ) \,. $$

By the assumption on $\Phi_{(-)}$ it follows that after pullback to $U$ the switching function $b$ is constant, so that it commutes with the differentials:

$$ (\Phi_{(-)})^\ast d \mathcal{S}_{b \mathbf{L}} \;=\; \int_\Sigma b \, j^\infty_\Sigma(\Phi_{(-)})^\ast ( \delta_{EL} \mathbf{L} ) \,. $$

This vanishes at $\Phi$ for all $\Phi_{(-)}$ precisely if all components of $j^\infty_\Sigma(\Phi)^\ast ( \delta_{EL} \mathbf{L} )$ vanish, which is the statement of the Euler-Lagrange equations of motion.
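The mechanism of this proof can be seen in the simplest possible case. The following LaTeX sketch assumes spacetime dimension $0+1$, so $\Sigma = \mathbb{R}$ with coordinate $t$, a single field coordinate $q$, and a Lagrangian of the illustrative form $L = \tfrac{1}{2} q_{,t}^2 - V(q)$. The potential $V$, the variation $\epsilon$ and the one-parameter family $\Phi_u = \Phi + u \epsilon$ are names introduced here only for this example.

```latex
% Assumptions: b = 1 on \mathcal{O}, and \epsilon is smooth with compact support inside \mathcal{O},
% so that b may be dropped under the integral and boundary terms vanish.
\begin{aligned}
  \frac{d}{d u} \mathcal{S}_{b \mathbf{L}}(\Phi + u \epsilon) \Big\vert_{u = 0}
  & =
  \int_{\mathbb{R}}
    \left( \dot \Phi(t) \, \dot \epsilon(t) - V'(\Phi(t)) \, \epsilon(t) \right) b(t) \, d t
  \\
  & =
  - \int_{\mathbb{R}}
    \left( \ddot \Phi(t) + V'(\Phi(t)) \right) \epsilon(t) \, d t \,.
\end{aligned}
```

The second line follows by integration by parts. It vanishes for all such $\epsilon$ precisely if $\ddot \Phi + V'(\Phi) = 0$ on $\mathcal{O}$, which is the Euler-Lagrange equation for this Lagrangian.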

Definition

(local observables)

Given a Lagrangian field theory $(E,\mathbf{L})$ (def. 60) with on-shell space of field histories $\Gamma_\Sigma(E)_{\delta_{EL} \mathbf{L} = 0}$ (62), the space

(92) $$ Obs_\Sigma(E) \;\coloneqq\; C^\infty\left( \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \right) $$

of observables is simply the space of complex-valued smooth functions

$$ A \;\colon\; \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \longrightarrow \mathbb{C} $$

on the on-shell space of field histories (62). This is a star-algebra under pointwise complex conjugation.

(That we consider functions with values in complex numbers instead of real numbers is a reflection of the superposition principle in quantum physics, more about this below.)

On the other hand the local observables are the horizontal $(p+1)$-forms

  1. of compact spacetime support (def. 81)

  2. modulo total spacetime derivatives

  3. restricted to the shell $\mathcal{E}^\infty$ (49):

$$ LocObs_\Sigma(E) \;\coloneqq\; \left(\Omega^{p+1,0}_{\Sigma,cp}(E)/(im(d))\right)\vert_{\mathcal{E}^\infty} $$

which we may identify, via transgression of variational differential forms $\tau_\Sigma$ (def. 82), with the subspace of those observables (92) that arise as transgressions of local observables to functionals on the on-shell space of field histories (62):

$$ LocObs_\Sigma(E) \overset{\tau_\Sigma}{\hookrightarrow} C^\infty\left( \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \right) \,. $$

This is a sub-vector space inside all observables which is in general not closed under the product of functions. We write

$$ \mathcal{F} \;\coloneqq\; \langle LocObs_\Sigma \rangle_{C^\infty\left(\Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0}\right)} $$

for the smallest subalgebra of observables, under the pointwise product, that contains all the local observables. This is called the algebra of multilocal observables.

maybe better consider formal power series of observables around a background solution

Example

(local observables of the real scalar field)

Consider the field bundle of the real scalar field (example 10).

A typical example of local observables (def. 83) in this case is the “field amplitude averaged over a given spacetime region” determined by a bump function $b \in C^\infty_{cp}(\Sigma)$. On an on-shell field history $\Phi$ this observable takes as value the integral

$$ \tau_\Sigma(b \phi)(\Phi) \;=\; \int_\Sigma \Phi(x) b(x) dvol_\Sigma(x) \,. $$
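To illustrate why the local observables are in general not closed under the pointwise product, the following LaTeX sketch compares, for the real scalar field, a local observable that is quadratic in the field with the product of two linear local observables. The bump functions $b_1, b_2 \in C^\infty_{cp}(\Sigma)$ are hypothetical choices made for this illustration.

```latex
% A local observable: a single integral of a density depending on the field at one point
\tau_\Sigma(b \, \phi^2)(\Phi) \;=\; \int_\Sigma \Phi(x)^2 \, b(x) \, dvol_\Sigma(x)

% The product of two local observables: a double integral over pairs of points
\left( \tau_\Sigma(b_1 \phi) \cdot \tau_\Sigma(b_2 \phi) \right)(\Phi)
  \;=\;
  \int_\Sigma \int_\Sigma
    \Phi(x) \, \Phi(y) \, b_1(x) \, b_2(y) \,
  dvol_\Sigma(x) \, dvol_\Sigma(y)
```

The second expression depends on the values of $\Phi$ at pairs of distinct points and is typically not the transgression of any single horizontal $(p+1)$-form; it is an element of the algebra of multilocal observables.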
Example

(local observables of the electromagnetic field)

Consider the field bundle for free electromagnetism on Minkowski spacetime $\Sigma$.

Then for $b \in C^\infty_{cp}(\Sigma)$ a bump function on spacetime, the transgression of the universal Faraday tensor (def. 36) against $b$ times the volume form is a local observable (def. 83), namely the field strength (20) of the electromagnetic field averaged over spacetime with weight $b$.

\,

Infinitesimal observables

The definition of observables in def. 71 and specifically of local observables in def. 83 uses explicit restriction to the shell, hence, by the principle of extremal action (prop. 45) to the “critical locus” of the action functional. Such critical loci are often hard to handle explicitly. It helps to consider a “homological resolution” that is given, in good circumstances, by the corresponding “derived critical locus”. These we consider in detail below in Reduced phase space. In order to have good control over these resolutions, we here consider the first perturbative aspect of field theory, namely we consider the restriction of local observables to just an infinitesimal neighbourhood of a background on-shell field history:

Definition

(local observables around infinitesimal neighbourhood of background on-shell field history)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) whose field bundle $E$ is a trivial vector bundle (example 9) and whose Lagrangian density $\mathbf{L}$ is spacetime-independent (example 24). Let $\Sigma \times \{\varphi\} \hookrightarrow \mathcal{E}$ be a constant section of the shell (56) as in example 24.

Then we write

$$ LocObs_\Sigma(E,\varphi) $$

for the restriction of the local observables (def. 83) to the fiberwise infinitesimal neighbourhood (example 27) of $\Sigma \times \{\varphi\}$.

Explicitly, this means the following:

First of all, by prop. 19 the dependence of the Lagrangian density $\mathbf{L}$ on the order of field derivatives is bounded by some $k \in \mathbb{N}$ on some neighbourhood of $\varphi$ and hence, by the spacetime independence of $\mathbf{L}$, on some neighbourhood of $\Sigma \times \{\varphi\}$.

Therefore we may restrict without loss to the order-$k$ jets. By slight abuse of notation we still write

$$ \mathcal{E} \hookrightarrow J^k_\Sigma(E) $$

for the corresponding shell. It follows then that the restriction of the ring $\Omega^{0,0}_{\Sigma,cp}(E)$ of smooth functions on the jet bundle to the infinitesimal neighbourhood (example 27) is equivalently the formal power series ring over $C^\infty_{cp}(\Sigma)$ in the variables

$$ ((\phi^a- \varphi^a), (\phi^a_{,\mu}- \varphi^a_{,\mu}), \cdots, (\phi^a_{,\mu_1 \cdots \mu_k} - \varphi^a_{,\mu_1 \cdots \mu_k}) ) $$

We denote this by

(93) $$ \Omega^{0,0}_{\Sigma,cp}(E,\varphi) \;\coloneqq\; C^\infty_{cp}(\Sigma)\left[ \left[ (\phi^a - \varphi^a ), (\phi^a_{,\mu} -\varphi^a_{,\mu}), \cdots, (\phi^a_{,\mu_1 \cdots \mu_k}- \varphi^a_{,\mu_1 \cdots \mu_k}) \right] \right] \,. $$

A key consequence is that the further restriction of this ring to the shell $\mathcal{E}^\infty$ (49) is now simply the further quotient ring by the ideal generated by the total spacetime derivatives of the components $\frac{\delta_{EL}L}{\delta \phi^a}$ of the Euler-Lagrange form (prop. 22):

(94) $$ \begin{aligned} \Omega^{0,0}_{\Sigma,cp}(E,\varphi)\vert_{\mathcal{E}} & \coloneqq \Omega^{0,0}_{\Sigma,cp}(E,\varphi) / \left( \frac{d^l}{ d x^{\mu_1} \cdots d x^{\mu_l}} \frac{\delta_{EL} L}{\delta \phi^a} \right)_{ { a \in \{1, \cdots, s\} } \atop { { l \in \{1, \cdots, k\} } \atop { \mu_r \in \{0, \cdots, p\} } } } \\ & = C^\infty_{cp}(\Sigma)\left[ \left[ (\phi^a - \varphi^a ), (\phi^a_{,\mu} -\varphi^a_{,\mu}), \cdots, (\phi^a_{,\mu_1 \cdots \mu_k}- \varphi^a_{,\mu_1 \cdots \mu_k}) \right] \right] / \left( \frac{d^l}{ d x^{\mu_1} \cdots d x^{\mu_l}} \frac{\delta_{EL} L}{\delta \phi^a} \right)_{ { a \in \{1, \cdots, s\} } \atop { { l \in \{1, \cdots, k\} } \atop { \mu_r \in \{0, \cdots, p\} } } } \end{aligned} \,. $$

Finally, the local observables restricted to the infinitesimal neighbourhood form the module

(95) $$ LocObs_\Sigma(E,\varphi) \;\simeq\; \left( \Omega^{0,0}_{\Sigma,cp}(E,\varphi)\vert_{\mathcal{E}} \langle dvol_\Sigma \rangle \right)/(im(d)) \,. $$
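As a rough illustration of what the on-shell quotient does in practice, the following LaTeX sketch considers a single free real scalar field around the background $\varphi = 0$. The Lagrangian, the signature $(+,-,\cdots,-)$ and the choice of background are assumptions made only for this example, and we work with jets of whatever finite order is needed to express the relations.

```latex
% Assumed Lagrangian: L = \tfrac{1}{2} \eta^{\mu\nu} \phi_{,\mu} \phi_{,\nu} - \tfrac{1}{2} m^2 \phi^2
% Its Euler-Lagrange component:
\frac{\delta_{EL} L}{\delta \phi}
  \;=\; \frac{\partial L}{\partial \phi} - \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi_{,\mu}}
  \;=\; - \left( \eta^{\mu \nu} \phi_{,\mu \nu} + m^2 \phi \right)

% Relations imposed in the quotient ring (the first, and its first total derivative):
\eta^{\mu \nu} \phi_{,\mu \nu} \;\equiv\; - m^2 \phi \,,
\qquad
\eta^{\mu \nu} \phi_{,\mu \nu \rho} \;\equiv\; - m^2 \phi_{,\rho}
```

So in the on-shell ring the trace of the second-order jet coordinates is not an independent variable: any power series in it may be rewritten in terms of $\phi$ alone, and similarly for its total derivatives.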

The space of local observables in def. 84 is the quotient of a formal power series algebra by the components of the Euler-Lagrange form and by the image of the horizontal spacetime de Rham differential. It is convenient to also conceive of the components of the Euler-Lagrange form as the image of a differential, for then the algebra of local observables obtains a cohomological interpretation, which will lend itself to computation. This differential, whose image is the components of the Euler-Lagrange form, is called the BV-differential. We introduce this now first (def. 85 below) in a direct ad-hoc way. Further below we discuss the conceptual nature of this differential as part of the construction of the reduced phase space as a derived critical locus (example 101 below).

Definition

(local BV-complex of ordinary Lagrangian density)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) whose field bundle $E$ is a trivial vector bundle (example 9) and whose Lagrangian density $\mathbf{L}$ is spacetime-independent (example 24). Let $\Sigma \times \{\varphi\} \hookrightarrow \mathcal{E}^\infty$ be a constant section of the shell (56).

In correspondence with def. 84, write

$$ \Gamma_{\Sigma,cp}(T_\Sigma J^\infty_\Sigma E,\varphi) \simeq \Gamma_{\Sigma,cp}(J^\infty_\Sigma T_\Sigma E,\varphi) \;\in\; \Omega^{0,0}_{\Sigma,cp}(E) Mod $$

for the restriction of vertical vector fields on the jet bundle to the fiberwise infinitesimal neighbourhood (example 27) of $\Sigma \times \{\varphi\}$.

Now we regard this as a graded module over $\Omega^{0,0}_{\Sigma,cp}(E,\varphi)$ (93) concentrated in degree $-1$:

$$ \Gamma_{\Sigma,cp}(J^\infty_\Sigma T_\Sigma E,\varphi)[-1] \;\in\; \Omega^{0,0}_{\Sigma,cp}(E) Mod^{\mathbb{Z}} \,. $$

This is called the module of antifields corresponding to the given type of fields encoded by $E$.

If the field bundle is a trivial vector bundle (example 9) with field coordinates $(\phi^a)$, then we write

(96) $$ \overline{\phi}_{a,\mu_1 \cdots \mu_l} \;\coloneqq\; \left( \partial_{(\phi^a_{\mu_1 \cdots \mu_l})} \right)[-1] \;\in\; \Gamma_{\Sigma,cp}(T_\Sigma J^\infty_\Sigma E,\varphi)[-1] $$

for the vector field generator that takes derivatives along $\partial_{\phi^a_{,\mu_1 \cdots \mu_l}}$, but regarded now in degree $-1$.

Evaluation of vector fields in the total spacetime derivatives d ldx μ 1dx μ lδLΩ Σ p,0(E)δΩ Σ 0,0(E)\frac{d^l}{d x^{\mu_1} \cdots d x^{\mu_l}} \delta\mathbf{L} \in \Omega^{p,0}_\Sigma(E) \wedge \delta \Omega^{0,0}_\Sigma(E) of the variational derivative (prop. 22) yields a linear map over Ω Σ,cp ,(E,φ)\Omega^{\bullet,\bullet}_{\Sigma,cp}(E,\varphi) (94)

ι ()δ ELL:Γ Σ,cp(J Σ T ΣE,φ)[1]Ω Σ,cp p+1,0(E,φ). \iota_{(-)}\delta_{EL} \mathbf{L} \;\colon\; \Gamma_{\Sigma,cp}( J^\infty_\Sigma T_\Sigma E,\varphi)[-1] \longrightarrow \Omega^{p+1,0}_{\Sigma,cp}(E,\varphi) \,.

If we use the volume form dvol Σdvol_\Sigma on spacetime Σ\Sigma to induce an identification

Ω Σ p+1,0(E)C (J Σ (E))dvol σ \Omega^{p+1,0}_\Sigma(E) \;\simeq\; C^\infty(J^\infty_\Sigma(E))\langle dvol_\sigma\rangle

with respect to which the Lagrangian density decomposes as

L=Ldvol Σ \mathbf{L} = L dvol_\Sigma

then this is a Ω σ 0,0(E,φ)\Omega^{0,0}_\sigma(E,\varphi)-linear map of the form

ι ()δL EL:Γ Σ,cp ev(T ΣE,φ)[1]Ω Σ,cp 0,0(E,φ). \iota_{(-)}{\delta L_{EL}} \;\colon\; \Gamma_{\Sigma,cp}^{ev}(T_\Sigma E,\varphi)[-1] \longrightarrow \Omega^{0,0}_{\Sigma,cp}(E,\varphi) \,.

In the special case that the field bundle $E \overset{fb}{\to} \Sigma$ is a trivial vector bundle (example 9) with field coordinates $(\phi^a)$, so that the Euler-Lagrange variational derivative has the coordinate expansion

$$ \delta_{EL}\mathbf{L} \;=\; \frac{\delta_{EL} L}{\delta \phi^a} \delta \phi^a \wedge dvol_\Sigma $$

then this map is given on the antifield basis elements (96) by

$$ \iota_{(-)} \delta_{EL} L \;\colon\; \overline{\phi}_{a,\mu_1 \cdots \mu_l} \;\mapsto\; \frac{d^l}{d x^{\mu_1} \cdots d x^{\mu_l}} \frac{\delta_{EL} L}{\delta \phi^a} \,. $$
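For orientation, here is a minimal worked instance of this map. It assumes a single real scalar field $\phi$ on Minkowski spacetime with the hypothetical Lagrangian function $L = -\tfrac{1}{2}\eta^{\mu\nu}\phi_{,\mu}\phi_{,\nu} - \tfrac{1}{2}m^2\phi^2$; the signs are an assumption of this sketch and may differ from the conventions of example 39. Under that assumption the Euler-Lagrange derivative and the values on the first antifield generators would be:

$$ \begin{aligned} \frac{\delta_{EL} L}{\delta \phi} & = \frac{\partial L}{\partial \phi} - \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi_{,\mu}} = \eta^{\mu\nu}\phi_{,\mu\nu} - m^2 \phi \\ \overline{\phi} & \mapsto \eta^{\mu\nu}\phi_{,\mu\nu} - m^2 \phi \\ \overline{\phi}_{,\rho} & \mapsto \frac{d}{d x^\rho}\left( \eta^{\mu\nu}\phi_{,\mu\nu} - m^2 \phi \right) = \eta^{\mu\nu}\phi_{,\mu\nu\rho} - m^2 \phi_{,\rho} \end{aligned} $$

So in this example each antifield generator is sent to the Klein-Gordon expression or to one of its total spacetime derivatives.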

Consider then the graded symmetric algebra

$$ C^\infty( J^\infty_\Sigma((T_\Sigma E)[-1] \times_\Sigma E, \varphi) ) \;\coloneqq\; Sym_{\Omega^{0,0}_{\Sigma,cp}(E,\varphi)}\left( \Gamma_{\Sigma,cp}(J^\infty_\Sigma T_\Sigma E,\varphi)[-1] \right) $$

which is generated over $\Omega^{0,0}_{\Sigma,cp}(E,\varphi)$ from the module of vector fields in degree $-1$.

If we think of a single vector field as a fiber-wise linear function on the cotangent bundle, and of a multivector field similarly as a multilinear function on the cotangent bundle, then we may think of this as the algebra of functions on the infinitesimal neighbourhood (example 27) of $\varphi$ inside the graded manifold $(T_\Sigma E)[-1] \times_\Sigma E$.

Let now

$$ s_{BV} \;\colon\; C^\infty( J^\infty_\Sigma((T_\Sigma E)[-1] \times_\Sigma E, \varphi) ) \;\longrightarrow\; C^\infty( J^\infty_\Sigma((T_\Sigma E)[-1] \times_\Sigma E, \varphi) ) \tag{97} $$

be the unique extension of the linear map $\iota_{(-)}\delta_{EL} L$ to an $\mathbb{R}$-linear derivation of degree $+1$ on this algebra.

The resulting differential graded-commutative algebra over $\mathbb{R}$

$$ \Omega^{0,0}_{\Sigma,cp}(E,\varphi)\vert_{\mathcal{E}_{BV}} \;\coloneqq\; \left( C^\infty( J^\infty_\Sigma((T_\Sigma E)[-1] \times_\Sigma E, \varphi) ) \,,\, s_{BV} \right) $$

is called the local BV-complex of the Lagrangian field theory at the background solution $\varphi$. This is the CE-algebra of the infinitesimal neighbourhood of $\Sigma \times \{\varphi\}$ in the derived prolonged shell (def. 120). In this case, in the absence of any explicit infinitesimal gauge symmetries, this is an example of a Koszul complex.
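To make the Koszul-complex structure explicit, the following sketch spells out the differential on generators in the case of a trivial vector bundle with antifield coordinates (96). It assumes, as for any Koszul differential, that $s_{BV}$ vanishes on the degree-0 generators and obeys the graded Leibniz rule:

$$ \begin{aligned} s_{BV}\left( \phi^a_{,\mu_1 \cdots \mu_l} \right) & = 0 \\ s_{BV}\left( \overline{\phi}_{a,\mu_1 \cdots \mu_l} \right) & = \frac{d^l}{d x^{\mu_1} \cdots d x^{\mu_l}} \frac{\delta_{EL} L}{\delta \phi^a} \\ s_{BV}\left( \overline{\phi}_a \overline{\phi}_b \right) & = s_{BV}(\overline{\phi}_a) \, \overline{\phi}_b - \overline{\phi}_a \, s_{BV}(\overline{\phi}_b) \\ s_{BV}\left( s_{BV}\left( \overline{\phi}_{a,\mu_1 \cdots \mu_l} \right) \right) & = 0 \end{aligned} $$

The last line holds because $s_{BV}$ of an antifield generator has degree 0, where $s_{BV}$ vanishes. In particular the image of $s_{BV}$ in degree 0 is the ideal generated by the Euler-Lagrange expressions and their total spacetime derivatives, so that the degree-0 cohomology is the quotient of $\Omega^{0,0}_{\Sigma,cp}(E,\varphi)$ by that ideal.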

There are canonical homomorphisms of dgc-algebras, one from the algebra of functions $\Omega^{0,0}_{\Sigma,cp}(E,\varphi)$ on the infinitesimal neighbourhood of the background solution $\varphi$ to the local BV-complex, and one from there to the local observables on the neighbourhood of the background solution $\varphi$ (94), all considered with compact spacetime support:

$$ \Omega^{0,0}_{\Sigma,cp}(E,\varphi) \longrightarrow \Omega^{0,0}_{\Sigma,cp}(E,\varphi)\vert_{\mathcal{E}_{BV}} \longrightarrow \Omega^{0,0}_{\Sigma,cp}(E,\varphi)\vert_{\mathcal{E}} $$

such that the composite is the canonical quotient coprojection.

Similarly we obtain a factorization for the entire variational bicomplex:

$$ \Omega^{\bullet,\bullet}_\Sigma(E,\varphi) \longrightarrow \Omega^{\bullet,\bullet}_\Sigma(E,\varphi)\vert_{\mathcal{E}_{BV}} \longrightarrow \Omega^{\bullet,\bullet}_\Sigma(E,\varphi)\vert_{\mathcal{E}} \,, \tag{98} $$

where $\Omega^{\bullet,\bullet}_\Sigma(E,\varphi)\vert_{\mathcal{E}_{BV}}$ is now triply graded, with three anti-commuting differentials $d$, $\delta$ and $s_{BV}$.

By construction this is now such that the local observables (def. 83) are the cochain cohomology of this complex in horizontal form degree $p+1$, vertical degree 0 and BV-degree 0:

$$ LocObs_\Sigma(E) \simeq \Omega^{p+1,0}_{\Sigma,cp}(E)/(im(s_{BV} + d)) \,. $$

\,

States

Definition

(states)

Given a Lagrangian field theory $(E,\mathbf{L})$, a (classical) state is a function from the space $Obs_\Sigma$ of observables to the complex numbers

$$ \langle -\rangle \;\colon\; Obs_\Sigma \longrightarrow \mathbb{C} $$

such that

  1. (linearity) this is a linear map;

  2. (positivity) for any $A \in Obs_\Sigma$ we have that $\langle A^\ast A \rangle \geq 0$.

Below we consider quantum states. These are defined formally in just the same way, only that now the algebra of observables is equipped with another product, which changes the meaning of the product expression $A^\ast A$.
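A simple instance of a classical state is evaluation at a fixed on-shell field history $\Phi_0$. The following sketch assumes that observables are complex-valued functions on the space of field histories, multiplied pointwise, with $(-)^\ast$ given by pointwise complex conjugation; these are assumptions made here for illustration:

$$ \begin{aligned} \langle A \rangle_{\Phi_0} & \coloneqq A(\Phi_0) \\ \langle c_1 A_1 + c_2 A_2 \rangle_{\Phi_0} & = c_1 A_1(\Phi_0) + c_2 A_2(\Phi_0) \\ \langle A^\ast A \rangle_{\Phi_0} & = \overline{A(\Phi_0)} \, A(\Phi_0) = \left\vert A(\Phi_0) \right\vert^2 \;\geq\; 0 \end{aligned} $$

Under these assumptions both linearity and positivity hold by construction.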

\,

This concludes our discussion of observables. In the next chapter we consider the construction of the covariant phase space and of the Poisson-Peierls bracket on observables.

\,

Phase space

It might seem that with the construction of the local observables (def. 83) on the on-shell space of field histories (prop. 22) the field theory defined by a Lagrangian density (def. 60) has been completely analyzed: This data specifies, in principle, which field histories are realized, and which observable properties these have.

In particular, if the Euler-Lagrange equations of motion (def. 61) admit Cauchy surfaces (def. 87 below), i.e. spatial codimension-1 slices of spacetime such that a field history is uniquely specified already by its restriction to the infinitesimal neighbourhood of that spatial slice, then a sufficiently complete collection of local observables whose spacetime support (def. 81) covers that Cauchy surface allows one to predict the evolution of the field histories through time from that Cauchy surface.

This is all that one might think a theory of physical fields should accomplish, and in fact this is essentially all that was thought to be required of a theory of nature from about Isaac Newton’s time to about Max Planck’s time.

But we have seen that a remarkable aspect of Lagrangian field theory is that the de Rham differential of the local Lagrangian density $\mathbf{L}$ (def. 60) decomposes into two kinds of variational differential forms (prop. 22), one of which is the Euler-Lagrange form which determines the equations of motion (47).

However, there is a second contribution: the presymplectic current $\Omega_{BFV} \in \Omega^{p,2}_{\Sigma}(E)$ (52). Since this is of horizontal degree $p$, its transgression (def. 82) implies a further structure on the space of field histories restricted to spacetime submanifolds of dimension $p$ (i.e. of spacetime “codimension 1”). There may be such submanifolds for which the restriction to their infinitesimal neighbourhood (example 27) does not actually change the on-shell space of field histories; these are called the Cauchy surfaces (def. 87 below).

By the Hamiltonian Noether theorem (prop. 35) the presymplectic current induces infinitesimal symmetries acting on field histories and local observables, given by the local Poisson bracket (prop. 36). The transgression (def. 82) of the presymplectic current to these Cauchy surfaces yields the corresponding infinitesimal symmetry group acting on the on-shell field histories, whose Lie bracket is the Poisson bracket pairing on on-shell observables (example 70 below). This data, the on-shell space of field histories on the infinitesimal neighbourhood of a Cauchy surface equipped with the infinitesimal symmetry exhibited by the Poisson bracket, is called the phase space of the theory (def. 88 below).

In fact if enough Cauchy surfaces exist, then the presymplectic forms associated with any one choice turn out to agree after pullback to the full on-shell space of field histories, exhibiting this as the covariant phase space of the theory (prop. 46 below), which is hence manifestly independent of a choice of space/time splitting. Accordingly, also the Poisson bracket on on-shell observables exists in a covariant form; for free field theories with Green hyperbolic equations of motion (def. 79) this is called the Peierls-Poisson bracket (theorem 2 below). The integral kernel for this Peierls-Poisson bracket is called the causal propagator (prop. 42). Its “normal ordered” or “positive frequency” component, called the Hadamard propagator (def. 107 below), as well as the corresponding time-ordered variant, called the Feynman propagator (def. 108 below), which we discuss in detail in Propagators below, control the causal perturbation theory for constructing perturbative quantum field theory by deforming the commutative pointwise product of on-shell observables to a non-commutative product governed to first order by the Peierls-Poisson bracket.

To see how such a deformation quantization comes about conceptually from the phase space structure, notice from the basic principles of homotopy theory that given any structure on a space which is invariant with respect to a symmetry group acting on the space (here: the presymplectic current), the true structure at hand is the homotopy quotient of that space by that symmetry group. We will explain this further below. For the moment we just point out that the homotopy quotient of the phase space by the infinitesimal symmetries of the presymplectic current is called the symplectic groupoid and that the true algebra of observables is hence the (polarized) convolution algebra of functions on this groupoid. This turns out to be the “algebra of quantum observables”, and the passage from the naive local observables on presymplectic phase space to this non-commutative algebra of functions on its homotopy quotient to the symplectic groupoid is called quantization. This we discuss in much detail below; for the moment this is just to motivate why the covariant phase space is the crucial construction to be extracted from a Lagrangian field theory.

\,

$$ \array{ \left\{ \array{ \text{on-shell space} \\ \text{of field histories} \\ \text{restricted to} \\ \text{Cauchy surface} } \right\} &\overset{\array{ \text{homotopy} \\ \text{quotient} \\ \text{by} \\ \text{infinitesimal} \\ \text{symmetries} }}{\longrightarrow} & \left\{ \array{ \text{covariant} \\ \text{phase space} } \right\} &\overset{ \array{\text{Lie algebra} \\ \text{of functions} } }{\longrightarrow}& \left\{ \array{ \text{Poisson algebra} \\ \text{of observables} } \right\} \\ & \searrow & \Big\downarrow{}^\mathrlap{{\text{Lie integration}}} && {}^{\mathllap{\text{quantization}}}\Big\downarrow \\ && \left\{ \array{ \text{symplectic} \\ \text{groupoid} } \right\} & \overset{ \array{ \text{polarized} \\ \text{convolution} \\ \text{algebra} } }{\longrightarrow}& \left\{ \array{ \text{quantum algebra} \\ \text{of observables} } \right\} } $$

\,

We now discuss these topics:

\,

\,

Covariant phase space

Definition

(Cauchy surface)

Given a Lagrangian field theory $(E, \mathbf{L})$ on a spacetime $\Sigma$ (def. 60), then a Cauchy surface is a submanifold $\Sigma_p \hookrightarrow \Sigma$ (def. 44) such that the restriction map from the on-shell space of field histories $\Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0}$ (62) to the space $\Gamma_{\Sigma_p}(E)_{\delta_{EL}\mathbf{L} = 0}$ (63) of on-shell field histories restricted to the infinitesimal neighbourhood of $\Sigma_p$ (example 27) is an isomorphism:

$$ \Gamma_\Sigma(E)_{\delta_{EL} \mathbf{L} = 0 } \underoverset{\simeq}{(-)\vert_{N_\Sigma \Sigma_p}}{\longrightarrow} \Gamma_{\Sigma_p}(E)_{\delta_{EL}\mathbf{L} = 0} \,. \tag{99} $$
Definition

(phase space associated with a Cauchy surface)

Given a Lagrangian field theory $(E, \mathbf{L})$ on a spacetime $\Sigma$ (def. 60) and given a Cauchy surface $\Sigma_p \hookrightarrow \Sigma$ (def. 87) then the corresponding phase space is

  1. the super smooth set $\Gamma_{\Sigma_p}(E)_{\delta_{EL}\mathbf{L} = 0}$ (63) of on-shell field histories restricted to the infinitesimal neighbourhood of $\Sigma_p$;

  2. equipped with the differential 2-form (as in def. 37)

    $$ \omega_{\Sigma_p} \;\coloneqq\; \tau_{\Sigma_p}\left(\Omega_{BFV}\right) \;\in\; \Omega^2\left( \Gamma_{\Sigma_p}(E)_{\delta_{EL}\mathbf{L} = 0} \right) \tag{100} $$

    which is the distributional transgression (def. 82) of the presymplectic current $\Omega_{BFV}$ (prop. 22) to $\Sigma_p$.

    This $\omega_{\Sigma_p}$ is a closed differential form in the sense of def. 37, due to prop. 44 and using that $\Omega_{BFV} = \delta \Theta_{BFV}$ is closed by definition (52). As such this is called the presymplectic form on the phase space.

Example

(evaluation of transgressed variational form on tangent vectors for free field theory)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) which is free (def. 62), hence whose field bundle is some smooth super vector bundle (example 9) and whose Euler-Lagrange equation of motion is linear. Then the synthetic tangent bundle (def. 26) of the on-shell space of field histories $\Gamma_{\Sigma,scp}(E)_{\delta_{EL}\mathbf{L} = 0}$ (62) with spacelike compact support (def. 31) is canonically identified with the Cartesian product of this super smooth set with itself

$$ T\left( \Gamma_{\Sigma,scp}(E)_{\delta_{EL} \mathbf{L} = 0} \right) \;\simeq\; \left(\Gamma_{\Sigma,scp}(E)_{\delta_{EL} \mathbf{L} = 0}\right) \times \left(\Gamma_{\Sigma,scp}(E)_{\delta_{EL} \mathbf{L} = 0}\right) \,. $$

With field coordinates as in example 9, we may expand the presymplectic current as

$$ \Omega_{BFV} = \left(\Omega_{BFV}\right)^{\mu_1, \cdots, \mu_{k_1}, \nu_1, \cdots, \nu_{k_2}, \kappa}_{a_1 a_2} \delta \phi^{a_1}_{\mu_1 \cdots \mu_{k_1}} \wedge \delta \phi^{a_2}_{\nu_1 \cdots \nu_{k_2}} \wedge \iota_{\partial_\kappa} dvol_\Sigma \,, $$

where the components $(\Omega_{BFV})_{a_1 a_2}^{\mu_1, \cdots, \mu_{k_1}, \nu_1, \cdots, \nu_{k_2}, \kappa}$ are smooth functions on the jet bundle.

Under these identifications the value of the presymplectic form $\omega_{\Sigma_p}$ (100) on two tangent vectors $\vec \Phi_1, \vec \Phi_2 \in \Gamma_{\Sigma,scp}(E)$ at a point $\Phi_1, \Phi_2 \in \Gamma_{\Sigma,scp}(E)$ is

$$ \omega_{\Sigma_p}(\vec \Phi_1, \vec \Phi_2) \;=\; \underset{\Sigma_p}{\int} \left(\Omega_{BFV}\right)^{\mu_1, \cdots, \mu_{k_1}, \nu_1, \cdots, \nu_{k_2}, \kappa}_{a_1 a_2}(\Phi_1(x), \Phi_2(x)) \left( \frac{\partial}{\partial x^{\mu_1}} \cdots \frac{\partial}{\partial x^{\mu_{k_1}}} \vec \Phi_1(x) \right) \left( \frac{\partial}{\partial x^{\nu_1}} \cdots \frac{\partial}{\partial x^{\nu_{k_2}}} \vec \Phi_2(x) \right) \, \iota_{\partial_\kappa} dvol_\Sigma(x) \,. $$
Example

(presymplectic form for free real scalar field)

Consider the Lagrangian field theory for the free real scalar field from example 39.

Under the identification of example 70 the presymplectic form on the phase space (def. 88) associated with a Cauchy surface $\Sigma_p \hookrightarrow \Sigma$ is given by

$$ \begin{aligned} \omega_{\Sigma_p}(\vec \Phi_1, \vec\Phi_2) & = \int_{\Sigma_{p}} \left( \frac{\partial \vec \Phi_1}{\partial x^\mu}(x) \vec \Phi_2(x) - \vec \Phi_1(x) \frac{\partial \vec \Phi_2}{\partial x^\mu}(x) \right) \eta^{\mu \nu} \iota_{\partial_\nu} dvol_{\Sigma}(x) \\ & = \underset{\Sigma_p}{\int} K(\vec \Phi_1, \vec \Phi_2) \,. \end{aligned} $$

Here the first equation follows via example 70 from the form of $\Omega_{BFV}$ from example 45, while the second equation identifies the integrand as the witness $K$ for the formal self-adjointness of the Klein-Gordon equation from example 51.
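The independence of this integral from the chosen constant-time slice can be probed numerically. The following is a minimal sketch under stated assumptions: spacetime is taken to be 1+1-dimensional with space a circle of circumference 2π, the equation of motion is taken as (∂_t² − ∂_x² + m²)Φ = 0, and the overall sign and normalization of the integrand are conventions of this sketch rather than of the text. The function names `plane_wave_solution` and `presymplectic_form` are hypothetical helpers.

```python
import numpy as np

def plane_wave_solution(amps, ks, phases, m):
    """Sum of plane waves a*cos(k x - w t + theta) with w = sqrt(k^2 + m^2).

    Each summand solves (d_t^2 - d_x^2 + m^2) phi = 0 on a circle of
    circumference 2*pi (integer k). Returns the callables (phi, dphi_dt).
    """
    amps = np.asarray(amps, dtype=float)[:, None]
    ks = np.asarray(ks, dtype=float)[:, None]
    phases = np.asarray(phases, dtype=float)[:, None]
    ws = np.sqrt(ks**2 + m**2)

    def phi(t, x):
        return (amps * np.cos(ks * x[None, :] - ws * t + phases)).sum(axis=0)

    def dphi_dt(t, x):
        return (amps * ws * np.sin(ks * x[None, :] - ws * t + phases)).sum(axis=0)

    return phi, dphi_dt

def presymplectic_form(sol1, sol2, t, n=256):
    """Integrate (d_t phi1 * phi2 - phi1 * d_t phi2) over the slice at time t."""
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    phi1, dphi1 = sol1
    phi2, dphi2 = sol2
    integrand = dphi1(t, x) * phi2(t, x) - phi1(t, x) * dphi2(t, x)
    # Rectangle rule on a periodic grid, accurate for low-frequency trigonometric integrands.
    return integrand.sum() * (2.0 * np.pi / n)

sol_a = plane_wave_solution([1.0, 0.5], [1, 2], [0.0, 0.3], m=1.0)
sol_b = plane_wave_solution([0.7, 1.2], [1, 3], [1.0, -0.4], m=1.0)
early = presymplectic_form(sol_a, sol_b, t=0.0)
late = presymplectic_form(sol_a, sol_b, t=2.7)
print(early, late)  # the two values agree up to rounding
```

Only the modes with equal wave number contribute, and the pairing is antisymmetric in its two arguments, as expected of a 2-form.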

Example

(presymplectic form for free Dirac field)

Consider the Lagrangian field theory of the free Dirac field (example 43).

Under the identification of example 70 the presymplectic form on the phase space (def. 88) associated with a Cauchy surface $\Sigma_p \hookrightarrow \Sigma$ is given by

$$ \begin{aligned} \omega_{\Sigma_p}(\theta_1 \vec \Psi_1, \theta_2 \vec\Psi_2) & = \int_{\Sigma_{p}} \left( \overline{\theta_1 \vec \Psi_1}\gamma^\mu \left( \theta_2 \vec \Psi_2 \right) \right) \iota_{\partial_\mu} dvol_{\Sigma}(x) \\ & = \underset{\Sigma_p}{\int} K(\theta_1 \vec \Psi_1, \theta_2 \vec \Psi_2) \,. \end{aligned} $$

Here the first equation follows via example 70 from the form of $\Omega_{BFV}$ from example 49, while the second equation identifies the integrand as the witness $K$ for the formal self-adjointness of the Dirac equation from example 53.

Proposition

(covariant phase space)

Consider $(E, \mathbf{L})$ a Lagrangian field theory on a spacetime $\Sigma$ (def. 60).

Let

$$ \Sigma_{tra} \overset{tra}{\hookrightarrow} \Sigma $$

be a submanifold with two boundary components $\partial \Sigma_{tra} = \Sigma_{in} \sqcup \Sigma_{out}$, both of which are Cauchy surfaces (def. 87).

Then the corresponding inclusion diagram

$$ \array{ && \Sigma_{tra} \\ & {}^{\mathllap{in}}\nearrow && \nwarrow^{\mathrm{out}} \\ \Sigma_{in} && && \Sigma_{out} } $$

induces a Lagrangian correspondence between the associated phase spaces (def. 88)

$$ \array{ && \Gamma_{\Sigma_{tra}}(E)_{\delta_{EL} \mathbf{L} = 0} \\ & {}^{\mathllap{ (-)\vert_{in} }}\swarrow && \searrow^{\mathrlap{ (-)\vert_{out} }} \\ \Gamma_{\Sigma_{in}}(E)_{\delta_{EL}\mathbf{L}= 0} && && \Gamma_{\Sigma_{out}}(E)_{\delta_{EL}\mathbf{L}= 0} \\ & {}_{\mathllap{\omega_{in}}}\searrow && \swarrow_{\mathrlap{\omega_{out}}} \\ && \mathbf{\Omega}^{2} } $$

in that the pullbacks of the two presymplectic forms (100) coincide on the space of field histories:

$$ \left( (-)\vert_{in}\right)^\ast\left( \omega_{in}\right) \;=\; \left( (-)\vert_{out} \right)^\ast \left( \omega_{out} \right) \qquad \in \Omega^2 \left( \Gamma_{\Sigma_{tra}}(E)_{\delta_{EL} \mathbf{L} = 0} \right) \,. $$

Hence there is a well defined presymplectic form

$$ \omega \in \Omega^2\left( \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \right) $$

on the genuine space of field histories, given by $\omega \coloneqq i^\ast \omega_{\Sigma_p}$ for any Cauchy surface $\Sigma_p \overset{i}{\hookrightarrow} \Sigma$. This presymplectic smooth space

$$ \left( \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \,,\, \omega \right) $$

is therefore called the covariant phase space of the Lagrangian field theory $(E,\mathbf{L})$.

Proof

By prop. 23 the total spacetime derivative $d \Omega_{BFV}$ of the presymplectic current vanishes on-shell:

$$ d \Omega_{BFV} = - \delta \delta_{EL} \mathbf{L} $$

in that the pullback (def. 10) along the shell inclusion $\mathcal{E} \overset{i_{\mathcal{E}}}{\hookrightarrow} J^\infty_\Sigma(E)$ (48) vanishes:

$$ \begin{aligned} (i_{\mathcal{E}})^\ast \left( d \Omega_{BFV} \right) & = - (i_{\mathcal{E}})^\ast \left( \delta \delta_{EL} \mathbf{L} \right) \\ & = - \delta \underset{ = 0 }{ \underbrace{ (i_{\mathcal{E}})^\ast \left( \delta_{EL} \mathbf{L} \right) } } \\ & = 0 \end{aligned} $$

This implies that the transgression of $d \Omega_{BFV}$ to the on-shell space of field histories $\Gamma_{\Sigma_{tra}}(E)_{\delta_{EL}\mathbf{L} = 0}$ vanishes (since by definition (61) that involves pulling back through the shell inclusion):

$$ \tau_{\Sigma_{tra}}(d \Omega_{BFV}) = 0 \,. $$

But then the claim follows with prop. 44:

$$ \begin{aligned} 0 & = \tau_{\Sigma_{tra}}(d \Omega_{BFV}) \\ & = ((-)\vert_{\Sigma_{tra}})^\ast \tau_{\partial \Sigma_{tra}} \Omega_{BFV} \,. \end{aligned} $$

Here the boundary $\partial \Sigma_{tra} = \Sigma_{in} \sqcup \Sigma_{out}$ contributes its two components with opposite induced orientations, so that the right-hand side is the difference of the pullbacks of $\omega_{out}$ and $\omega_{in}$; its vanishing is the claimed equality.
Theorem

(polynomial Poisson bracket on covariant phase space – the Peierls bracket)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) such that

  1. it is a free field theory (def. 62)

  2. whose Euler-Lagrange equation of motion $P \Phi = 0$ (def. 61) is

    1. formally self-adjoint or formally anti self-adjoint (def. 58) such that

      • the integral over the witness $K$ (33) is the presymplectic form (100): $\omega_{\Sigma_p} = \underset{\Sigma_p}{\int} K$;
    2. Green hyperbolic (def. 79).

Write

$$ \mathrm{G}_P \;\colon\; LinObs(E_{scp},\mathbf{L})^{reg} \longrightarrow \Gamma_{\Sigma,scp}(E)_{\delta_{EL}\mathbf{L} = 0} $$

for the linear map from regular linear field observables (def. 80) to on-shell field histories with spatially compact support (def. 31) given under the identification (90) by the causal Green function $\mathrm{G}_P$ (def. 78).

Then for every Cauchy surface $\Sigma_p \hookrightarrow \Sigma$ (def. 87) this map is an inverse to the presymplectic form $\omega_{\Sigma_p}$ (def. 88) in that, under the identification of tangent vectors to field histories from example 70, the composite

$$ \array{ \omega_{\Sigma_p}(\mathrm{G}_P(-),(-)) \;=\; ev &\colon& LinObs(E_{scp},\mathbf{L})^{reg} &\otimes& \Gamma_{\Sigma,scp}(E) &\longrightarrow& \mathbb{C} \\ && (A &,& \Phi) &\mapsto& A(\Phi) } \tag{101} $$

equals the evaluation map of observables on field histories.

This means that for every Cauchy surface $\Sigma_p$ the presymplectic form $\omega_{\Sigma_p}$ restricts to a symplectic form on regular linear observables. The corresponding Poisson bracket is

$$ \left\{ -,- \right\}_{\Sigma_p} \;\coloneqq\; \omega_{\Sigma_p}(\mathrm{G}_P(-), \mathrm{G}_P(-)) \;\;\colon\;\; LinObs(E_{scp},\mathbf{L})^{reg} \otimes LinObs(E_{scp},\mathbf{L})^{reg} \longrightarrow \mathbb{R} \,. $$

Moreover, equation (101) implies that this is the covariant Poisson bracket in the sense of the covariant phase space (prop. 46) in that it does not actually depend on the choice of Cauchy surface.

An equivalent expression for the Poisson bracket that makes its independence from the choice of Cauchy surface manifest is the $P$-Peierls bracket given by

$$ \array{ LinObs(E_{scp},\mathbf{L})^{reg} \otimes LinObs(E_{scp},\mathbf{L})^{reg} &\overset{\{-,-\}}{\longrightarrow}& \mathbb{R} \\ (\alpha^\ast, \beta^\ast) &\mapsto& \underset{\Sigma}{\int} \mathrm{G}_P(\alpha^\ast) \cdot \beta^\ast \, dvol_\Sigma } \tag{102} $$

where on the left $\alpha^\ast, \beta^\ast \in \Gamma_{\Sigma,cp}(E^\ast) \simeq LinObs(E_{scp},\mathbf{L})^{reg}$.

Hence under the given assumptions, for every Cauchy surface the Poisson bracket associated with that Cauchy surface equals the invariantly (“covariantly”) defined Peierls bracket

$$ \{-,-\}_{\Sigma_p} = \{-,-\} \,. $$

Finally this means that in terms of the causal propagator $\Delta$ (84) the covariant Peierls-Poisson bracket is given in generalized function-notation by

$$ \{\alpha^\ast, \beta^\ast\} \;=\; \underset{\Sigma}{\int} \underset{\Sigma}{\int} \alpha^\ast(x) \cdot \Delta(x,y) \cdot \beta^\ast(y) \, dvol_\Sigma(x)\, dvol_\Sigma(y) \tag{103} $$

Therefore, while the point-evaluation field observables $\mathbf{\Phi}^a(x)$ (def. 60) are not themselves regular observables (def. 80), the Peierls-Poisson bracket (103) is induced from the following distributional bracket between them

$$ \left\{ \mathbf{\Phi}^a(x) , \mathbf{\Phi}^b(y) \right\} \;=\; \Delta^{a b}(x,y) $$

with the causal propagator (84) on the right, in that with the identification (91) the Peierls-Poisson bracket on regular linear observables arises as follows:

$$ \begin{aligned} \left\{ \underset{\Sigma}{\int} \alpha^\ast_a(x) \mathbf{\Phi}^a(x) \, dvol_\Sigma(x) \,,\, \underset{\Sigma}{\int} \beta^\ast_b(y) \mathbf{\Phi}^b(y) \, dvol_\Sigma(y) \right\} & = \underset{\Sigma}{\int} \underset{\Sigma}{\int} \alpha^\ast_a(x) \underset{= \Delta^{a b}(x,y)}{ \underbrace{ \left\{ \mathbf{\Phi}^a(x), \mathbf{\Phi}^b(y) \right\} } } \beta^\ast_b(y) \, dvol_\Sigma(x)\, dvol_\Sigma(y) \\ & = \underset{\Sigma}{\int} \underset{\Sigma}{\int} \alpha^\ast_a(x) \Delta^{a b}(x,y) \beta^\ast_b(y) \, dvol_\Sigma(x)\, dvol_\Sigma(y) \end{aligned} $$

(Khavkine 14, lemma 2.5)

Proof

Consider two more Cauchy surfaces $\Sigma_p^\pm \hookrightarrow I^\pm(\Sigma_p) \hookrightarrow \Sigma$, in the future $I^+$ and in the past $I^-$ of $\Sigma_p$, respectively. Choose a partition of unity on $\Sigma$ consisting of two elements $\chi_\pm \in C^\infty(\Sigma)$ with support bounded by these Cauchy surfaces: $supp(\chi_\pm) \subset I^\pm(\Sigma_p^{\mp})$.

Then define

$$ P_\chi \;\colon\; \Gamma_{\Sigma,scp}(E) \longrightarrow \Gamma_{\Sigma,cp}(E^\ast) \tag{104} $$

by

$$ \begin{aligned} P_\chi(\Phi) & \coloneqq \phantom{-} P(\chi_+ \Phi) \\ & = - P(\chi_- \Phi) \,. \end{aligned} \tag{105} $$

Notice that the support of the partitioned field history is in the compactly sourced future/past cone

$$ \chi_\pm \Phi \;\in\; \Gamma_{\Sigma,\pm cp}(E) \tag{106} $$

since $\Phi$ is supported in the compactly sourced causal cone, but that $P(\chi_\pm \Phi)$ indeed has compact support as required by (104): Since $P(\Phi) = 0$, by assumption, the support is the intersection of that of $\Phi$ with that of $d \chi_\pm$, and the first is spacelike compact by assumption, while the latter is timelike compact, by definition of the partition of unity.

Similarly, the equality in (105) holds because by partition of unity $P(\chi_+ \Phi) + P(\chi_-\Phi) = P((\chi_+ + \chi_-)\Phi ) = P(\Phi) = 0$.

It follows that

$$ \begin{aligned} \mathrm{G}_P \circ P_\chi (\Phi) & = \left( \mathrm{G}_{P,+} - \mathrm{G}_{P,-} \right) P_\chi (\Phi) \\ & = \underset{ = \chi_+ \Phi}{\underbrace{\mathrm{G}_{P,+} P(\chi_+ \Phi)}} + \underset{ = \chi_- \Phi }{\underbrace{\mathrm{G}_{P,-} P(\chi_- \Phi)}} \\ & = (\chi_+ + \chi_-)\Phi \\ & = \Phi \,, \end{aligned} \tag{107} $$

where in the second line we chose from the two equivalent expressions (105) such that via (106) the defining property of the advanced or retarded Green function, respectively, may be applied, as shown under the braces.

(Khavkine 14, lemma 2.1)
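The identity (107) can be checked numerically in the simplest case. The sketch below assumes a 0+1-dimensional toy model that is not taken from the text: the harmonic oscillator with P = d²/dt² + w², retarded and advanced Green functions whose difference has the kernel sin(w(t−s))/w, and a polynomial step function in the role of χ₊. The names `smoothstep` and `causal_green_of_P_chi` are hypothetical helpers.

```python
import numpy as np

def smoothstep(s):
    """C^2 step chi_+ (0 for s <= -1, 1 for s >= 1) with its first two derivatives."""
    u = np.clip((s + 1.0) / 2.0, 0.0, 1.0)
    chi = 6 * u**5 - 15 * u**4 + 10 * u**3
    dchi = (30 * u**4 - 60 * u**3 + 30 * u**2) / 2.0
    d2chi = (120 * u**3 - 180 * u**2 + 60 * u) / 4.0
    return chi, dchi, d2chi

def causal_green_of_P_chi(t, a, b, w, n=20001):
    """Evaluate (G_P P_chi Phi)(t) for the solution Phi(s) = a cos(w s) + b sin(w s)."""
    s = np.linspace(-1.0, 1.0, n)  # P(chi_+ Phi) is supported where chi_+ varies
    _, dchi, d2chi = smoothstep(s)
    phi = a * np.cos(w * s) + b * np.sin(w * s)
    dphi = -a * w * np.sin(w * s) + b * w * np.cos(w * s)
    # P(chi Phi) = chi'' Phi + 2 chi' Phi' + chi P(Phi), and P(Phi) = 0 on-shell.
    source = d2chi * phi + 2.0 * dchi * dphi
    # Kernel of the causal Green function G_+ - G_- under the stated assumptions.
    kernel = np.sin(w * (t - s)) / w
    f = kernel * source
    h = s[1] - s[0]
    return h * (f.sum() - 0.5 * (f[0] + f[-1]))  # trapezoid rule

t = 0.5
print(causal_green_of_P_chi(t, a=1.5, b=-0.4, w=2.0),
      1.5 * np.cos(2.0 * t) - 0.4 * np.sin(2.0 * t))
```

The two printed numbers should agree up to discretization error, at times inside as well as outside the interval where the step function varies.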

Now we apply this to the computation of $\omega_{\Sigma_p}(\mathrm{G}_P(-),-)$:

$$ \begin{aligned} \omega_{\Sigma_p}(\mathrm{G}_P(\alpha^\ast),\vec \Phi) & = \underset{\Sigma_p}{\int} K(\mathrm{G}_P(\alpha^\ast), \vec \Phi) \\ & = \underset{\Sigma_p}{\int} K(\mathrm{G}_P(\alpha^\ast), \chi_+\vec \Phi) + \underset{\Sigma_p}{\int} K(\mathrm{G}_P(\alpha^\ast), \chi_-\vec \Phi) \\ & = \underset{I^-(\Sigma_p)}{\int} d K(\mathrm{G}_P(\alpha^\ast), \chi_+\vec \Phi) - \underset{I^+(\Sigma_p)}{\int} d K(\mathrm{G}_P(\alpha^\ast), \chi_-\vec \Phi) \\ & = \underset{I^-(\Sigma_p)}{\int} \left( \underset{= 0}{ \underbrace{ P(\mathrm{G}_P(\alpha^\ast))}} \cdot \chi_+\vec \Phi \mp \mathrm{G}_P(\alpha^\ast) \cdot P(\chi_+ \vec \Phi) \right) dvol_\Sigma - \underset{I^+(\Sigma_p)}{\int} \left( \underset{= 0}{ \underbrace{ P(\mathrm{G}_P(\alpha^\ast))}} \cdot \chi_-\vec \Phi \mp \mathrm{G}_P(\alpha^\ast) \cdot P(\chi_- \vec \Phi) \right) dvol_\Sigma \\ & = \mp \left( \underset{I^-(\Sigma_p)}{\int} \mathrm{G}_P(\alpha^\ast) \cdot P(\chi_+ \vec \Phi) \, dvol_\Sigma + \underset{I^+(\Sigma_p)}{\int} \mathrm{G}_P(\alpha^\ast) \cdot P(\chi_+ \vec \Phi) \, dvol_\Sigma \right) \\ & = \underset{\Sigma}{\int} \mathrm{G}_P(\alpha^\ast) \cdot P(\chi_+ \vec \Phi) \, dvol_\Sigma \\ & = \underset{\Sigma}{\int} \alpha^\ast \cdot \mathrm{G}_{P} (P (\chi_+ \vec \Phi)) \, dvol_\Sigma \\ & = \underset{\Sigma}{\int} \alpha^\ast \cdot \vec \Phi \, dvol_\Sigma \end{aligned} $$

Here we computed as follows:

  1. applied the assumption that $\omega_{\Sigma_p}(-,-) = \underset{\Sigma_p}{\int} K(-,-)$;

  2. applied the above partition of unity;

  3. used the Stokes theorem (prop. 4) for the past and the future of $\Sigma_p$, respectively;

  4. applied the definition of $d K$ as the witness of the formal (anti-)self-adjointness of $P$ (def. 58);

  5. used $P\circ \mathrm{G}_P = 0$ on $\Gamma_{\Sigma,cp}(E^\ast)$ (def. 78) and used (105);

  6. unified the two integration domains, now that the integrands are the same;

  7. used the formal (anti-)self-adjointness of the Green functions (example 65);

  8. used (107).
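As a minimal illustration of the resulting distributional bracket, consider the $0+1$-dimensional case of a harmonic oscillator. This is a sketch under assumptions that are not taken from the text: the operator is $P = \frac{d^2}{d t^2} + \nu_0^2$ with a hypothetical frequency parameter $\nu_0$, the causal Green function is $\mathrm{G}_{P,+} - \mathrm{G}_{P,-}$ with retarded and advanced Green functions $\mathrm{G}_{P,\pm}$, and the causal propagator $\Delta(t,s)$ is identified with its integral kernel in the order $\mathrm{G}_P(\alpha^\ast)(t) = \int \Delta(t,s)\, \alpha^\ast(s)\, d s$ (the opposite ordering flips the overall sign):

$$ \begin{aligned} \mathrm{G}_{P,+}(t,s) & = \theta(t-s) \frac{\sin(\nu_0 (t-s))}{\nu_0} \,, \qquad \mathrm{G}_{P,-}(t,s) = - \theta(s-t) \frac{\sin(\nu_0 (t-s))}{\nu_0} \\ \Delta(t,s) & = \mathrm{G}_{P,+}(t,s) - \mathrm{G}_{P,-}(t,s) = \frac{\sin(\nu_0 (t-s))}{\nu_0} \\ \left\{ \mathbf{\Phi}(t), \mathbf{\Phi}(s) \right\} & = \frac{\sin(\nu_0 (t-s))}{\nu_0} \\ \left\{ \mathbf{\Phi}(t), \dot{\mathbf{\Phi}}(t) \right\} & = \frac{\partial}{\partial s} \Delta(t,s) \Big\vert_{s = t} = -1 \end{aligned} $$

So under these assumptions the bracket is antisymmetric, vanishes at equal times, and its equal-time bracket of position with velocity is constant, which reproduces the canonical bracket up to the overall sign convention.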

Example

(scalar field and Dirac field have covariant Peierls-Poisson bracket)

Examples of free Lagrangian field theories for which the assumptions of theorem 2 are satisfied, so that the covariant Poisson bracket exists in the form of the Peierls bracket, include the free real scalar field and the free Dirac field.

For the free scalar field this is the statement of example 63 with example 71, while for the Dirac field this is the statement of example 64 with example 72.

For the free electromagnetic field (example 40) the assumptions of theorem 2 are violated: the covariant phase space does not exist. But in the discussion of Gauge fixing, below, we will find that for an equivalent re-incarnation of the electromagnetic field, they are met after all.

\,

BV-resolution of the covariant phase space

So far we have discussed the covariant phase space (prop. 46) in terms of explicit restriction to the shell. We now turn to the more flexible perspective where a homological resolution of the shell in terms of “antifields” is used (def. 85).

Example

(BV-presymplectic current)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) whose field bundle $E$ is a trivial vector bundle (example 9) and whose Lagrangian density $\mathbf{L}$ is spacetime-independent (example 24). Let $\Sigma \times \{\varphi\} \hookrightarrow \mathcal{E}$ be a constant section of the shell (56).

Then in the BV-variational bicomplex (98) there exists the BV-presymplectic potential

$$ \Theta_{BV} \;\coloneqq\; \overline{\phi}_a \delta \phi^a \, dvol_\Sigma \;\in\; \Omega^{p,1}_\Sigma(E,\varphi)\vert_{\mathcal{E}_{BV}} \tag{108} $$

and the corresponding BV-presymplectic current

$$ \Omega_{BV} \;\in\; \Omega^{p,2}_\Sigma(E,\varphi)\vert_{\mathcal{E}_{BV}} $$

defined by

$$ \begin{aligned} \Omega_{BV} & \coloneqq \delta \Theta_{BV} \\ & = \delta \overline{\phi}_a \wedge \delta \phi^a \wedge dvol_{\Sigma} \,, \end{aligned} $$

where $(\phi^a)$ are the given field coordinates, $\overline{\phi}_a$ the corresponding antifield coordinates (96) and $\frac{\delta_{EL} \mathbf{L}}{\delta \phi^a}$ the corresponding components of the Euler-Lagrange form (prop. 22).

Proposition

(local BV-BFV relation)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) whose field bundle $E$ is a trivial vector bundle (example 9) and whose Lagrangian density $\mathbf{L}$ is spacetime-independent (example 24). Let $\Sigma \times \{\varphi\} \hookrightarrow \mathcal{E}$ be a constant section of the shell (56).

Then the BV-presymplectic current $\Omega_{BV}$ (example 74) witnesses the on-shell vanishing (prop. 23) of the total spacetime derivative of the genuine presymplectic current $\Omega_{BFV}$ (prop. 22) in that the total spacetime derivative of $\Omega_{BFV}$ equals the BV-differential $s_{BV}$ of $\Omega_{BV}$:

$$ d \Omega_{BFV} = s \Omega_{BV} \,. $$

Hence if $\Sigma_{tra} \hookrightarrow \Sigma$ is a submanifold of spacetime of full dimension $p+1$ with boundary $\partial \Sigma_{tra} = \Sigma_{in} \sqcup \Sigma_{out}$

$$ \array{ && \Sigma_{tra} \\ & {}^{\mathllap{in}}\nearrow && \nwarrow^{\mathrm{out}} \\ \Sigma_{in} && && \Sigma_{out} } $$

then the pullbacks of the two presymplectic forms (100) on the incoming and outgoing spaces of field histories, respectively, differ by the BV-differential of the transgression of the BV-presymplectic current:

$$ \left( (-)\vert_{in}\right)^\ast\left( \omega_{in}\right) \;-\; \left( (-)\vert_{out} \right)^\ast \left( \omega_{out} \right) = \tau_{\mathbb{D} \times \Sigma_{tra}} ( s \Omega_{BV} ) \qquad \in \Omega^2 \left( \Gamma_{\Sigma_{tra}}(E)_{\delta_{EL} \mathbf{L} = 0} \right) \,. $$

This homological resolution of the Lagrangian correspondence that exhibits the “covariance” of the covariant phase space (prop. 46) is known as the BV-BFV relation (Cattaneo-Mnev-Reshetikhin 12 (9)).

Proof

For the first statement we compute as follows:

$$ \begin{aligned} s \Omega_{BV} & = - \delta (s \overline{\phi}_a) \delta \phi^a \wedge dvol_{\Sigma} \\ & = - \delta \frac{\delta_{EL}\mathbf{L} }{\delta \phi^a} \delta \phi^a dvol_{\Sigma} \\ & = - \delta \delta_{EL}\mathbf{L} \\ & = d \Omega_{BFV} \,, \end{aligned} $$

where the first steps simply unwind the definitions, and where the last step is prop. 23.

With this the second statement follows by immediate generalization of the proof of prop. 46.

Example

(derived presymplectic current of real scalar field)

Consider a Lagrangian field theory (def. 60) without any non-trivial implicit infinitesimal gauge transformations (def. \ref{ImplicitInfinitesimalGaugeSymmetry}); for instance the real scalar field from example 39.

Inside its local BV-complex (def. 85) we may form the linear combination of

  1. the presymplectic current $\Omega_{BFV}$ (example 45)

  2. the BV-presymplectic current $\Omega_{BV}$ (example 74).

This yields a vertical 2-form

$$ \Omega \;\coloneqq\; \Omega_{BV} + \Omega_{BFV} \;\; \in \Omega^{p,2}_\Sigma(E)\vert_{\mathcal{E}_{BV}} $$

which might be called the derived presymplectic current.

Similarly we may form the linear combination of

  1. the presymplectic potential current $\Theta_{BFV}$ (46)

  2. the BV-presymplectic potential current $\Theta_{BV}$ (108)

  3. the Lagrangian density $\mathbf{L}$ (def. 60)

hence

$$ \Theta \;\coloneqq\; \Theta_{BV} + \underset{Lepage}{\underbrace{ \Theta_{BFV} + \mathbf{L} }} $$

(where the sum of the two terms on the right is the Lepage form (53)). This might be called the derived presymplectic potential current.

We then have that

$$ (\delta + (d-s))\Omega \;=\; 0 $$

and in fact

$$ (\delta + (d-s))\Theta \;=\; \Omega \,. $$

Proof

Of course the first statement follows from the second, but in fact the two contributions of the first statement even vanish separately:

$$ \delta \Omega = 0 \,, \phantom{AAAA} (d-s)\Omega = 0 \,. $$

The statement on the left is immediate from the definitions, since $\Omega = \delta (\Theta_{BV} + \Theta_{BFV})$ and $\delta^2 = 0$. For the statement on the right we compute

$$ \begin{aligned} (d - s) (\Omega_{BV} + \Omega_{BFV}) & = \underset{= 0}{\underbrace{d \Omega_{BFV} - s \Omega_{BV} }} + \underset{ = 0}{\underbrace{ d \Omega_{BV} }} - \underset{ = 0}{\underbrace{ s \Omega_{BFV} }} \\ & = 0 \end{aligned} $$

Here the first pair of terms cancels via the local BV-BFV relation (prop. 47), while the other two terms vanish individually by degree reasons.

Similarly for the second statement we compute as follows:

$$ \begin{aligned} (\delta + (d - s) ) \Theta & = \underset{ = \Omega_{BV} + \Omega_{BFV}}{\underbrace{ \delta (\Theta_{BV} + \Theta_{BFV}) }} + \underset{ = \delta \mathbf{L}}{\underbrace{\mathbf{d} \mathbf{L}}} + \underset{ = 0 }{\underbrace{ (d-s) \mathbf{L} }} + (d-s)(\Theta_{BV} + \Theta_{BFV}) \\ & = \Omega_{BV} + \Omega_{BFV} + \delta \mathbf{L} + \underset{ = 0}{\underbrace{d \Theta_{BV}}} - \underset{ = \delta_{EL} \mathbf{L} }{\underbrace{ s \Theta_{BV}}} + \underset{ = \delta_{EL}\mathbf{L} - \delta \mathbf{L} }{\underbrace{ d \Theta_{BFV} } } - \underset{ = 0 }{\underbrace{ s \Theta_{BFV} }} \\ & = \Omega_{BV} + \Omega_{BFV} \end{aligned} \,. $$

Here the direct vanishing of various terms is again by simple degree reasons, and otherwise we used the definition of $\Omega$ and, crucially, the variational identity $\delta \mathbf{L} = \delta_{EL}\mathbf{L} - d \Theta_{BFV}$ (46).

\,

Hamiltonian local observables

We have defined the local observables (def. 83) as the transgressions of horizontal $p+1$-forms (with compact spacetime support) to the on-shell space of field histories $\Gamma_{\Sigma}(E)_{\delta_{EL}\mathbf{L} = 0}$ over all of spacetime $\Sigma$. More explicitly, these could be called the spacetime local observables.

But with every choice of Cauchy surface $\Sigma_p \hookrightarrow \Sigma$ (def. 87) comes another notion of local observables: those that are transgressions of horizontal $p$-forms (instead of $p+1$-forms) to the on-shell space of field histories restricted to the infinitesimal neighbourhood of that Cauchy surface (def. 42): $\Gamma_{\Sigma_p}(E)_{\delta_{EL} \mathbf{L} = 0}$. These are spatially local observables, with respect to the given choice of Cauchy surface.

Among these spatially local observables are the Hamiltonian local observables (def. 89 below), which are transgressions specifically of the Hamiltonian forms (def. 70). These inherit a transgression of the local Poisson bracket (prop. 36) to a Poisson bracket on Hamiltonian local observables (def. 90 below). This is known as the Peierls bracket (example 76 below).

Definition

(Hamiltonian local observables)

Let $(E, \mathbf{L})$ be a Lagrangian field theory (def. 60).

Consider a local observable (def. 83)

$$ \tau_\Sigma(A) \;\colon\; \Gamma_\Sigma(E)_{\delta_{EL}\mathbf{L} = 0} \longrightarrow \mathbb{C} \,, $$

hence the transgression of a variational horizontal $p+1$-form $A \in \Omega^{p+1,0}_{\Sigma,cp}(E)$ of compact spacetime support.

Given a Cauchy surface $\Sigma_p \hookrightarrow \Sigma$ (def. 87) we say that $\tau_\Sigma (A)$ is Hamiltonian if it is also the transgression of a Hamiltonian differential form (def. 70), hence if there exists

$$ (H,v) \in \Omega^{p,0}_{\Sigma, Ham}(E) $$

whose transgression over the Cauchy surface $\Sigma_p$ equals the transgression of $A$ over all of spacetime $\Sigma$, under the isomorphism (99)

$$ \array{ \Gamma_\Sigma(E)_{\delta_{EL} \mathbf{L} = 0 } && \underoverset{\simeq}{(-)\vert_{N_\Sigma \Sigma_p}}{\longrightarrow} && \Gamma_{\Sigma_p}(E)_{\delta_{EL}\mathbf{L} = 0} \\ & {}_{\mathllap{\tau_\Sigma(A)}}\searrow && \swarrow_{\mathrlap{ \tau_{\Sigma_p}(H) }} \\ && \mathbb{C} } $$

Beware that the local observable $\tau_{\Sigma_p}(H)$ defined by a Hamiltonian differential form $H \in \Omega^{p,0}_{\Sigma,Ham}(E)$ as in def. 89 in general depends not just on the choice of $H$, but also on the choice $\Sigma_p$ of the Cauchy surface. The exceptions are those Hamiltonian forms which are conserved currents:

Proposition

(conserved charges: transgression of conserved currents)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60).

If a Hamiltonian differential form $J \in \Omega^{p,0}_{\Sigma,Ham}(E)$ (def. 70) happens to be a conserved current (def. 66), in that its total spacetime derivative vanishes on-shell

$$ d J \vert_{\mathcal{E}} \;= \; 0 $$

then the induced Hamiltonian local observable $\tau_{\Sigma_p}(J)$ (def. 89) is independent of the choice of Cauchy surface $\Sigma_p$ (def. 87), in that for any two Cauchy surfaces $\Sigma_p, \Sigma'_p \hookrightarrow \Sigma$ which are cobordant we have

$$ \tau_{\Sigma_p}(J) = \tau_{\Sigma'_p}(J) \,. $$

The resulting observable is called the conserved charge of the conserved current, traditionally denoted

$$ Q \;\coloneqq\; \tau_{\Sigma_p}(J) \,. $$

Proof

By definition the transgression of $d J$ vanishes on the on-shell space of field histories. Therefore the result is given by Stokes' theorem (prop. 4).
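The Stokes argument can be spelled out in one line. The following is a sketch under assumptions not stated in the text: $\Sigma_{tra}$ denotes a (hypothetical) cobordism between the two Cauchy surfaces, oriented so that $\partial \Sigma_{tra} = \Sigma'_p \sqcup \overline{\Sigma_p}$, and $\Phi$ is any on-shell field history; the third step uses that pullback along jet prolongation intertwines $d_\Sigma$ with the horizontal derivative $d$ (prop. 20).

```latex
\begin{aligned}
  \tau_{\Sigma'_p}(J)(\Phi) - \tau_{\Sigma_p}(J)(\Phi)
  & = \int_{\partial \Sigma_{tra}} j^\infty_\Sigma(\Phi)^\ast J
  \\
  & = \int_{\Sigma_{tra}} d_\Sigma \, j^\infty_\Sigma(\Phi)^\ast J
  \\
  & = \int_{\Sigma_{tra}} j^\infty_\Sigma(\Phi)^\ast ( d J )
  \\
  & = 0
\end{aligned}
```

The last step is where the on-shell condition on $\Phi$ enters, since $d J$ is only required to vanish on the shell $\mathcal{E}$.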

Definition

(Poisson bracket of Hamiltonian local observables on covariant phase space)

Let $(E, \mathbf{L})$ be a Lagrangian field theory (def. 60) where the field bundle $E \overset{fb}{\to} \Sigma$ is a trivial vector bundle over Minkowski spacetime (example 9).

We say that the Poisson bracket on Hamiltonian local observables (def. 89) is the transgression (def. 82) of the local Poisson bracket (prop. 36) of the corresponding Hamiltonian differential forms (def. 70) to the covariant phase space (prop. 46).

Explicitly: for $\Sigma_p \hookrightarrow \Sigma$ a choice of Cauchy surface (def. 87), the Poisson bracket between two local Hamiltonian observables $\tau_{\Sigma_p}((H_i, v_i))$ is

(109) $$ \left\{ \tau_{\Sigma_p}((H_1, v_1)) \,,\, \tau_{\Sigma_p}( (H_2, v_2) ) \right\} \;\coloneqq\; \tau_{\Sigma_p}( \, \{ (H_1, v_1), (H_2, v_2) \} \, ) \,, $$

where on the right we have the transgression of the local Poisson bracket $\{(H_1, v_1), (H_2, v_2)\}$ of Hamiltonian differential forms on the jet bundle from prop. 36.

Proof

We need to see that equation (109) is well defined, in that it does not depend on the choice of Hamiltonian form $(H_i, v_i)$ representing the local Hamiltonian observable $\tau_{\Sigma_p}(H_i)$.

It is clear that all the transgressions involved depend only on the restriction of the Hamiltonian forms to the pullback of the jet bundle to the infinitesimal neighbourhood $N_\Sigma \Sigma_p$. Moreover, the Poisson bracket on the jet bundle (76) clearly respects this restriction.

If a Hamiltonian differential form $H$ is in the kernel of the transgression map relative to $\Sigma_p$, in that for every smooth collection $\Phi_{(-)} \colon U \to \Gamma_{\Sigma_p}(E)_{\delta_{EL}\mathbf{L} = 0}$ of field histories (according to def. 37) we have (by def. 82)

$$ \int_{\Sigma_p} j^\infty_\Sigma(\Phi_{(-)})^\ast H \;= \;0 \;\;\; \in \Omega^p(U) $$

then the fact that the kernel of integration is the exact differential forms says that $j^\infty_\Sigma(\Phi_{(-)})^\ast H \in \Omega^p(U \times \Sigma)$ is $d_\Sigma$-exact and hence in particular $d_\Sigma$-closed for all $\Phi_{(-)}$:

$$ d_\Sigma j^\infty(\Phi_{(-)})^\ast H \;=\; 0 \,. $$

By prop. 20 this means that

$$ j^\infty(\Phi_{(-)})^\ast ( d H ) \;= \; 0 $$

for all $\Phi_{(-)}$. Since $H \in \Omega^{p,0}_\Sigma(E)$ is horizontal, the same proposition (see also example 38) implies that in fact $H$ is horizontally closed:

$$ d H \;=\; 0 \,. $$

Now since the field bundle $E \overset{fb}{\to} \Sigma$ is trivial by assumption, prop. 21 applies and says that this horizontally closed form on the jet bundle is in fact horizontally exact.

In conclusion this shows that the kernel of the transgression map $\tau_{\Sigma_p} \;\colon\; \Omega^{p,0}_\Sigma(E) \to C^\infty\left( \Gamma_{\Sigma_p}(E)\right)$ is precisely the space of horizontally exact horizontal $p$-forms.

Therefore the claim now follows with the statement that horizontally exact Hamiltonian differential forms constitute a Lie ideal for the local Poisson bracket on the jet bundle; this is lemma 2.

Example

(Poisson bracket of the real scalar field)

Consider the Lagrangian field theory of the free scalar field (example 39), and consider the Cauchy surface defined by $x^0 = 0$.

By example 58 the local Poisson bracket of the Hamiltonian forms

$$ Q \coloneqq \phi \iota_{\partial_0} dvol_\Sigma \in \Omega^{p,0}(E) $$

and

$$ P \coloneqq \eta^{\mu \nu} \phi_{,\mu} \iota_{\partial_\nu} dvol_{\Sigma} \in \Omega^{p,0}(E) $$

is

$$ \{Q,P\} = \iota_{v_Q} \iota_{v_P} \omega = \iota_{\partial_0} dvol_\Sigma \,. $$

Upon transgression according to def. 90 this yields the following Poisson bracket

$$ \left\{ \int_{\Sigma_p} b_1(\vec x) \phi(t,\vec x) \iota_{\partial_0} dvol_\Sigma(\vec x) \;,\; \int_{\Sigma_p} b_2(\vec x) \partial_0 \phi(t,\vec x) \iota_{\partial_0} dvol_\Sigma(\vec x) \right\} \;=\; \int_{\Sigma_p} b_1(\vec x) b_2(\vec x) \iota_{\partial_0} dvol_\Sigma(\vec x) \,, $$

where

$$ \mathbf{\Phi}(x), \partial_0 \mathbf{\Phi}(x) \;:\; PhaseSpace(\Sigma_p^t) \to \mathbb{R} $$

denote the point-evaluation observables (example 60), which act on a field history $\Phi \in \Gamma_\Sigma(E) = C^\infty(\Sigma)$ as

$$ \mathbf{\Phi}(x) \;\colon\; \Phi \mapsto \Phi(x) \phantom{AAAAAAAA} \partial_0 \mathbf{\Phi}(x) \;\colon\; \Phi \mapsto \partial_0 \Phi(x) \,. $$

Notice that these point-evaluation functions themselves do not arise as the transgression of elements in $\Omega^{p,0}(E)$; only their smearings such as $\int_{\Sigma_p} b_1 \phi \, dvol_{\Sigma_p}$ do. Nevertheless we may express the above Poisson bracket conveniently via the integral kernel

(110) $$ \left\{ \mathbf{\Phi}(t,\vec x), \partial_0\mathbf{\Phi}(t,\vec y) \right\} \;=\; \delta(\vec x - \vec y) \,. $$
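To make the relation between the smeared bracket and its integral kernel (110) concrete, here is a minimal lattice sketch for one spatial dimension. It is an illustration under assumptions that are not in the text: the Cauchy surface is replaced by a finite lattice of spacing `a`, the delta distribution by `delta_ij / a`, and the smearing functions `b1`, `b2` are Gaussians chosen only because their overlap integral is known in closed form. The helper name `bracket` is hypothetical.

```python
import math

def bracket(dF, dG, a):
    """Canonical bracket of two linear observables on a lattice of spacing a.

    dF and dG are pairs (gradient w.r.t. phi_i, gradient w.r.t. pi_i), where
    pi_i plays the role of the time derivative of phi at site i. The lattice
    rule {phi_i, pi_j} = delta_ij / a stands in for delta(x - y).
    """
    (f_phi, f_pi), (g_phi, g_pi) = dF, dG
    return sum((fp * gq - fq * gp) / a
               for fp, fq, gp, gq in zip(f_phi, f_pi, g_phi, g_pi))

a = 0.05                                    # lattice spacing (assumption)
xs = [-8.0 + i * a for i in range(321)]     # lattice sites covering [-8, 8]
b1 = [math.exp(-x * x) for x in xs]         # illustrative smearing functions
b2 = [math.exp(-(x - 1.0) ** 2) for x in xs]
zeros = [0.0] * len(xs)

# Smeared observables Q = sum_i a*b1_i*phi_i and P = sum_i a*b2_i*pi_i,
# represented by their (constant) gradients.
Q = ([a * b for b in b1], zeros)
P = (zeros, [a * b for b in b2])

lattice_value = bracket(Q, P, a)
# Closed form of the overlap integral of b1 and b2 over the real line.
continuum_value = math.exp(-0.5) * math.sqrt(math.pi / 2)
```

The lattice bracket reduces to the Riemann sum `a * sum(b1_i * b2_i)`, which is the discrete counterpart of the right-hand side of the smeared bracket above; swapping the arguments flips the sign, as expected for a Poisson bracket.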
Proposition

(super-Poisson bracket of the Dirac field)

Consider the Lagrangian field theory of the free Dirac field on Minkowski spacetime (example 43) with field bundle the odd-shifted spinor bundle $E = \Sigma \times S_{odd}$ (example 35) and with

$$ \theta \Psi_\alpha(x) \;\colon\; \mathbb{R}^{0\vert 1} \longrightarrow \left[ \Gamma_\Sigma(\Sigma \times S_{odd})_{\delta_{EL}\mathbf{L} = 0}, \mathbb{C} \right] $$

the corresponding odd-graded point-evaluation observable (example 60).

Then consider the Cauchy surfaces in Minkowski spacetime (def. 23) given by $x^0 = t$ for $t \in \mathbb{R}$. Under transgression to this Cauchy surface via def. 90, the local Poisson bracket, which by example 59 is given by the super Lie bracket

$$ \left\{ \left( \gamma^\mu \psi \right)_\alpha \, \iota_{\partial_\mu} dvol_\Sigma \,,\, \left(\overline{\psi}\gamma^\mu\right)^\beta\, \iota_{\partial_\mu} dvol_\Sigma \right\} \;=\; \left(\gamma^\mu\right)_\alpha{}^{\beta} \, \iota_{\partial_\mu} dvol_\Sigma \,, $$

has integral kernel

$$ \left\{ \psi_\alpha(t,\vec x) , \overline{\psi}^\beta(t,\vec y) \right\} \;=\; (\gamma^0)_{\alpha}{}^\beta \delta(\vec y - \vec x) \,. $$

\,

This concludes our discussion of the phase space and the Poisson-Peierls bracket for well behaved Lagrangian field theories. In the next chapter we discuss in detail the integral kernels corresponding to the Poisson-Peierls bracket for key classes of examples. These are the propagators of the theory.

\,

Propagators

In the previous chapter we have seen the covariant phase space (prop. 46) of sufficiently nice Lagrangian field theories, which is the on-shell space of field histories equipped with the presymplectic form transgressed from the presymplectic current of the theory; and we have seen that in good cases this induces a bilinear pairing on sufficiently well-behaved observables, called the Poisson bracket (def. 90), which reflects the infinitesimal symmetries of the presymplectic current. This Poisson bracket is of central importance for passing to actual quantum field theory, since, as we will discuss in Quantization below, it is the infinitesimal approximation to the quantization of a Lagrangian field theory.

We have moreover seen that the Poisson bracket on the covariant phase space of a free field theory with Green hyperbolic equations of motion – the Peierls-Poisson bracket – is determined by the integral kernel of the causal Green function (prop. 2). Under the identification of linear on-shell observables with off-shell observables that are generalized solutions to the equations of motion (theorem 1) the convolution with this integral kernel may be understood as propagating the values of an off-shell observable through spacetime, such as to then compare it with any other observable at any spacetime point (prop. 2). Therefore the integral kernel of the causal Green function is also called the causal propagator.

This means that for Green hyperbolic free Lagrangian field theory the Poisson bracket, and hence the infinitesimal quantization of the theory, is all encoded in the causal propagator. Therefore here we analyze the causal propagator, as well as its variant propagators, in detail.

The main tool for these computations is Fourier analysis (reviewed below) by which field histories, observables and propagators on Minkowski spacetime are decomposed as superpositions of plane waves of various frequencies, wave lengths and wave vector-direction. Using this, all propagators are exhibited as those superpositions of plane waves which satisfy the dispersion relation of the given equation of motion, relating plane wave frequency to wave length.

This way the causal propagator is naturally decomposed into its contribution from positive and from negative frequencies. The positive frequency part of the causal propagator is called the Hadamard propagator (def. 107 below). It turns out (prop. 69 below) that this is equivalently the sum of the causal propagator, which itself is skew-symmetric (cor. 2 below), with a symmetric component, or equivalently that the causal propagator is the skew-symmetrization of the Hadamard propagator. After quantization of free field theory discussed further below, we will see that the Hadamard propagator is equivalently the correlation function between two point-evaluation field observables (example 60) in a vacuum state of the field theory (a state in the sense of def. 86).

Moreover, by def. 78 the causal propagator also decomposes into its contributions with future and past support, given by the difference between the advanced and retarded propagators. These we analyze first, starting with prop. 64 below.

Combining these two decompositions of the causal propagator (positive/negative frequency as well as positive/negative time) yields one more propagator, the Feynman propagator (def. 108 below).

We will see below that the quantization of a free field theory is given by a “star product” (on observables) which is given by “exponentiating” these propagators. For that to make sense, certain pointwise products of these propagators, regarded as generalized functions (prop. 38), need to exist. But since the propagators are distributions with singularities, the existence of these products requires that certain potential “UV divergences” in their Fourier transforms are absent (“Hörmander's criterion”, prop. 58 below). These UV divergences are captured by what is called the wave front set (def. 101 below).

The study of UV divergences of distributions via their wave front sets is called microlocal analysis and provides powerful tools for the understanding of quantum field theory. In particular the propagation of singularities theorem (prop. 59) shows that for distributional solutions (def. 77) of Euler-Lagrange equations of motion, such as the propagators, their singular support propagates itself through spacetime along the wave front set.

Using this theorem we work out the wave front sets of the propagators (prop. 75 below). Via Hörmander's criterion (prop. 58) this computation will serve to show why upon quantization the Hadamard propagator replaces the causal propagator in the construction of the Wick algebra of quantum observables of the free field theory (discussed below in Free quantum fields) and why the Feynman propagator similarly controls the quantum observables of the interacting quantum field theory (below in Feynman diagrams).

\,

The following table summarizes the structure of the system of propagators. (The column “as vacuum expectation value of field operators” will be discussed further below in Free quantum fields).

\,

propagators (i.e. integral kernels of Green functions) for the wave operator and Klein-Gordon operator on a globally hyperbolic spacetime such as Minkowski spacetime:

| name | symbol | wave front set | as vacuum exp. value of field operators | as a product of field operators |
|---|---|---|---|---|
| causal propagator | $\Delta_S = \Delta_+ - \Delta_-$ | (figure) $-$ (figure) | $i \hbar \, \Delta_S(x,y) = \left\langle \; \left[\mathbf{\Phi}(x),\mathbf{\Phi}(y)\right] \; \right\rangle$ | Peierls-Poisson bracket |
| advanced propagator | $\Delta_+$ | (figure) | $i \hbar \, \Delta_+(x,y) = \left\{ \array{ \left\langle \; \left[ \mathbf{\Phi}(x),\mathbf{\Phi}(y) \right] \; \right\rangle &\vert& x \geq y \\ 0 &\vert& y \geq x } \right.$ | future part of Peierls-Poisson bracket |
| retarded propagator | $\Delta_-$ | (figure) | $i \hbar \, \Delta_-(x,y) = \left\{ \array{ \left\langle \; \left[\mathbf{\Phi}(x),\mathbf{\Phi}(y) \right] \; \right\rangle &\vert& y \geq x \\ 0 &\vert& x \geq y } \right.$ | past part of Peierls-Poisson bracket |
| Hadamard propagator | $\Delta_H = \tfrac{i}{2}\left( \Delta_+ - \Delta_-\right) + H = \tfrac{i}{2}\Delta_S + H = \Delta_F - i \Delta_-$ | (figure) | $\hbar \, \Delta_H(x,y) = \left\langle \; \mathbf{\Phi}(x) \mathbf{\Phi}(y) \; \right\rangle = \underset{ = 0 }{\underbrace{\left\langle \; : \mathbf{\Phi}(x) \mathbf{\Phi}(y) : \; \right\rangle}} + \left\langle \; \left[ \mathbf{\Phi}^{(-)}(x), \mathbf{\Phi}^{(+)}(y) \right] \; \right\rangle$ | positive frequency of Peierls-Poisson bracket, normal-ordered product, 2-point function of vacuum state or generally of Hadamard state |
| Dirac propagator | $\Delta_D = \tfrac{1}{2}(\Delta_+ + \Delta_-)$ | (figure) $+$ (figure) | | would-be time-ordered product away from coincident points |
| Feynman propagator | $\Delta_F = \tfrac{i}{2}\left( \Delta_+ + \Delta_- \right) + H = i \Delta_D + H = \Delta_H + i \Delta_-$ | (figure) | $\hbar \, \Delta_F(x,y) = \left\langle \; T\left( \; \mathbf{\Phi}(x)\mathbf{\Phi}(y) \;\right) \; \right\rangle = \left\{ \array{ \left\langle \; \mathbf{\Phi}(x)\mathbf{\Phi}(y) \; \right\rangle &\vert& x \geq y \\ \left\langle \; \mathbf{\Phi}(y) \mathbf{\Phi}(x) \; \right\rangle &\vert& y \geq x } \right.$ | time-ordered product |

(see also Kocic’s overview: pdf)
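The linear relations in the “symbol” column of the table can be checked mechanically. The following sketch treats $\Delta_+$, $\Delta_-$ and $H$ as formal basis elements and represents every propagator by its coefficient vector; this bookkeeping (and the helper names `combo`, `add`, `scale`) is purely illustrative and says nothing about the distributions themselves.

```python
# Coefficient vectors with respect to the formal basis (Delta_plus, Delta_minus, H).
def combo(plus=0, minus=0, h=0):
    return (complex(plus), complex(minus), complex(h))

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    return tuple(c * a for a in u)

D_plus, D_minus, H = combo(plus=1), combo(minus=1), combo(h=1)

D_S = add(D_plus, scale(-1, D_minus))        # causal:   Delta_+ - Delta_-
D_D = scale(0.5, add(D_plus, D_minus))       # Dirac:    (Delta_+ + Delta_-)/2
D_H = add(scale(0.5j, D_S), H)               # Hadamard: (i/2) Delta_S + H
D_F = add(scale(1j, D_D), H)                 # Feynman:  i Delta_D + H

# The remaining expressions from the table, to be compared with the above.
D_H_alt = add(D_F, scale(-1j, D_minus))      # Delta_F - i Delta_-
D_F_alt = add(D_H, scale(1j, D_minus))       # Delta_H + i Delta_-
```

With these definitions `D_H == D_H_alt` and `D_F == D_F_alt` hold, so the three expressions given for each of the Hadamard and Feynman propagators are mutually consistent.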

\,

We now discuss these topics:

\,

Fourier analysis and plane wave modes

By definition, the equations of motion of free field theories (def. 62) are linear partial differential equations and hence lend themselves to harmonic analysis, where all field histories are decomposed into superpositions of plane waves via Fourier transform. Here we briefly survey the relevant definitions and facts of Fourier analysis.

In formal duality to the harmonic analysis of the field histories themselves, also the linear observables (def. 72) on the space of field histories, hence the distributional generalized functions (prop. 37) are subject to Fourier transform of distributions (def. 96 below).

Throughout, let $n \in \mathbb{N}$ and consider the Cartesian space $\mathbb{R}^n$ of dimension $n$ (def. 1). In the application to field theory, $n = p + 1$ is the dimension of spacetime and $\mathbb{R}^n$ is either Minkowski spacetime $\mathbb{R}^{p,1}$ (def. 23) or its dual vector space, thought of as the space of wave vectors (def. 91 below). For $x = (x^\mu) \in \mathbb{R}^{p,1}$ and $k = (k_\mu) \in (\mathbb{R}^{p,1})^\ast$ we write

$$ x \cdot k \;=\; x^\mu k_\mu $$

for the canonical pairing.

Definition

(plane wave)

A plane wave on Minkowski spacetime $\mathbb{R}^{p,1}$ (def. 23) is a smooth function with values in the complex numbers given by

$$ \array{ \mathbb{R}^{p,1} &\longrightarrow& \mathbb{C} \\ (x^\mu) &\mapsto& e^{i k_\mu x^\mu} } $$

for $k = (k_\mu) \in (\mathbb{R}^{p,1})^\ast$ a covector, called the wave vector of the plane wave.

We use the following terminology:

plane waves on Minkowski spacetime

$$ \array{ \mathbb{R}^{p,1} &\overset{\psi_k}{\longrightarrow}& \mathbb{C} \\ x &\mapsto& \exp\left( \, i k_\mu x^\mu \, \right) \\ (\vec x, x^0) &\mapsto& \exp\left( \, i \vec k \cdot \vec x + i k_0 x^0 \, \right) \\ (\vec x, c t) &\mapsto& \exp\left( \, i \vec k \cdot \vec x - i \omega t \, \right) } $$

| symbol | name |
|---|---|
| $c$ | speed of light |
| $\hbar$ | Planck's constant |
| $m$ | mass |
| $\frac{\hbar}{m c}$ | Compton wavelength |
| $k$, $\vec k$ | wave vector |
| $\lambda = 2\pi/{\vert \vec k \vert}$ | wave length |
| ${\vert \vec k \vert} = 2\pi/\lambda$ | wave number |
| $\omega \coloneqq k^0 c = -k_0 c = 2\pi \nu$ | angular frequency |
| $\nu = \omega / 2 \pi$ | frequency |
| $p = \hbar k$, $\vec p = \hbar \vec k$ | momentum |
| $E = \hbar \omega$ | energy |
| $\omega(\vec k) = c \sqrt{ \vec k^2 + \left(\frac{m c}{\hbar}\right)^2 }$ | Klein-Gordon dispersion relation |
| $E(\vec p) = \sqrt{ c^2 \vec p^2 + (m c^2)^2 }$ | energy-momentum relation |
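The last two rows of the table are the same statement in different variables: substituting $E = \hbar \omega$ and $\vec p = \hbar \vec k$ turns the Klein-Gordon dispersion relation into the energy-momentum relation. A small numerical sketch of this, where the choice of the electron mass and of a wavelength of one picometre is purely illustrative and the function names are hypothetical:

```python
import math

C = 299_792_458.0              # speed of light in m/s
HBAR = 1.054_571_817e-34       # reduced Planck constant in J s
M_ELECTRON = 9.109_383_7e-31   # electron mass in kg (illustrative choice)

def omega(k_abs, m):
    """Klein-Gordon dispersion relation: angular frequency for wave number |k|."""
    return C * math.sqrt(k_abs ** 2 + (m * C / HBAR) ** 2)

def energy(p_abs, m):
    """Energy-momentum relation E(p) = sqrt(c^2 p^2 + (m c^2)^2)."""
    return math.sqrt((C * p_abs) ** 2 + (m * C ** 2) ** 2)

k_abs = 2 * math.pi / 1e-12                 # wave number for wavelength 1 pm
E_from_omega = HBAR * omega(k_abs, M_ELECTRON)
E_from_p = energy(HBAR * k_abs, M_ELECTRON)
```

Both routes give the same energy up to floating-point rounding, and for `m = 0` the dispersion relation reduces to `omega = c * |k|`.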
Definition

(Schwartz space of functions with rapidly decreasing partial derivatives)

A complex-valued smooth function $f \in C^\infty(\mathbb{R}^n)$ is said to have rapidly decreasing partial derivatives if for all $\alpha,\beta \in \mathbb{N}^{n}$ we have

$$ \underset{x \in \mathbb{R}^n}{sup} {\vert x^\beta \partial^\alpha f(x) \vert} \;\lt\; \infty \,. $$

Write

$$ \mathcal{S}(\mathbb{R}^n) \hookrightarrow C^\infty(\mathbb{R}^n) $$

for the sub-vector space of functions with rapidly decreasing partial derivatives, regarded as a topological vector space for the Fréchet space structure induced by the seminorms

$$ p_{\alpha, \beta}(f) \coloneqq \underset{x \in \mathbb{R}^n}{sup} {\vert x^\beta \partial^\alpha f(x) \vert} \,. $$

This is also called the Schwartz space.

(e.g. Hörmander 90, def. 7.1.2)

Example

(compactly supported smooth functions are functions with rapidly decreasing partial derivatives)

Every compactly supported smooth function (bump function) $b \in C^\infty_{cp}(\mathbb{R}^n)$ has rapidly decreasing partial derivatives (def. 92):

$$ C^\infty_{cp}(\mathbb{R}^n) \hookrightarrow \mathcal{S}(\mathbb{R}^n) \,. $$

Proposition

(pointwise product and convolution product on Schwartz space)

The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ (def. 92) is closed under the following operations on smooth functions $f,g \in \mathcal{S}(\mathbb{R}^n) \hookrightarrow C^\infty(\mathbb{R}^n)$:

  1. pointwise product:

    $$ (f \cdot g)(x) \coloneqq f(x) \cdot g(x) $$

  2. convolution product:

    $$ (f \star g)(x) \coloneqq \underset{y \in \mathbb{R}^n}{\int} f(y)\cdot g(x-y) \, dvol(y) \,. $$

Proof

By the product law of differentiation.

Proposition

(rapidly decreasing functions are integrable)

Every rapidly decreasing function $f \colon \mathbb{R}^n \to \mathbb{R}$ (def. 92) is an integrable function in that its integral exists:

$$ \underset{x \in \mathbb{R}^n}{\int} f(x) \, d^n x \;\lt\; \infty $$

In fact for each $\alpha \in \mathbb{N}^n$ the integral of the product of $f$ with the $\alpha$-power of the coordinate functions exists:

$$ \underset{x \in \mathbb{R}^n}{\int} x^\alpha f(x)\, d^n x \;\lt\; \infty \,. $$

Definition

(Fourier transform of functions with rapidly decreasing partial derivatives)

The Fourier transform is the continuous linear map

$$ \widehat{(-)} \;\colon\; \mathcal{S}(\mathbb{R}^n) \longrightarrow \mathcal{S}(\mathbb{R}^n) $$

on the Schwartz space of functions with rapidly decreasing partial derivatives (def. 92), which is given by integration against plane wave functions (def. 91)

$$ x \mapsto e^{- i k \cdot x} $$

times the standard volume form $d^n x$:

(111) $$ \hat f(k) \;\coloneqq\; \int_{x \in \mathbb{R}^n} e^{- i \, k \cdot x} f(x) \, d^n x \,. $$

Here the argument $k \in \mathbb{R}^n$ of the Fourier transform is also called the wave vector.

(e.g. Hörmander 90, lemma 7.1.3)
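As a sanity check of the sign and normalization conventions in (111), one may evaluate the defining integral numerically for $n = 1$. The sketch below is an illustration under assumptions: the integral is truncated to a finite interval and approximated by a midpoint Riemann sum, and the test function is the Gaussian $e^{-x^2/2}$, whose transform in this convention is $\sqrt{2\pi}\, e^{-k^2/2}$. The function names are hypothetical.

```python
import cmath
import math

def fourier_transform(f, k, x_max=12.0, n=4000):
    """Approximate f_hat(k) = integral of exp(-i k x) f(x) dx over [-x_max, x_max]
    by a midpoint Riemann sum with n cells."""
    dx = 2 * x_max / n
    total = 0j
    for j in range(n):
        x = -x_max + (j + 0.5) * dx
        total += cmath.exp(-1j * k * x) * f(x)
    return total * dx

def gaussian(x):
    return math.exp(-x * x / 2)

def gaussian_hat(k):
    """Closed-form transform of the Gaussian in the convention of (111)."""
    return math.sqrt(2 * math.pi) * math.exp(-k * k / 2)

samples = {k: fourier_transform(gaussian, k) for k in (0.0, 0.5, 1.0, 2.0)}
```

Note that no factor of $2\pi$ appears in the forward transform with this convention; it sits entirely in the inverse transform.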

Proposition

(Fourier inversion theorem)

The Fourier transform $\widehat{(-)}$ (def. 93) on the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ (def. 92) is an isomorphism, with inverse function the inverse Fourier transform

$$ \widecheck {(-)} \;\colon\; \mathcal{S}(\mathbb{R}^n) \longrightarrow \mathcal{S}(\mathbb{R}^n) $$

given by

$$ \widecheck g (x) \;\coloneqq\; \underset{k \in \mathbb{R}^n}{\int} g(k) e^{i k \cdot x} \, \frac{d^n k}{(2\pi)^n} \,. $$

Hence in the language of harmonic analysis the function $\widecheck g \colon \mathbb{R}^n \to \mathbb{C}$ is the superposition of plane waves (def. 91) in which the plane wave with wave vector $k\in \mathbb{R}^n$ appears with amplitude $g(k)$.

(e.g. Hörmander 90, theorem 7.1.5)
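The inversion formula can be illustrated numerically for $n = 1$. In the sketch below (an illustration, not part of the cited proof) the test function $f(x) = (1 + x) e^{-x^2/2}$ is an arbitrary choice; its transform $\sqrt{2\pi}\,(1 - i k)\, e^{-k^2/2}$ follows from the Gaussian case by the rule for multiplication with a coordinate function. Integrals are truncated and approximated by midpoint sums, and all function names are hypothetical.

```python
import cmath
import math

def integrate(h, lo, hi, n=4000):
    """Midpoint Riemann sum of h over [lo, hi] with n cells."""
    dx = (hi - lo) / n
    return sum(h(lo + (j + 0.5) * dx) for j in range(n)) * dx

def fourier(f, k, cutoff=12.0):
    """Forward transform: integral of exp(-i k x) f(x) dx."""
    return integrate(lambda x: cmath.exp(-1j * k * x) * f(x), -cutoff, cutoff)

def inverse_fourier(g, x, cutoff=12.0):
    """Inverse transform: integral of g(k) exp(i k x) dk / (2 pi)."""
    return integrate(lambda k: g(k) * cmath.exp(1j * k * x), -cutoff, cutoff) / (2 * math.pi)

def f(x):
    return (1 + x) * math.exp(-x * x / 2)

def f_hat_exact(k):
    return math.sqrt(2 * math.pi) * (1 - 1j * k) * math.exp(-k * k / 2)

forward_error = abs(fourier(f, 0.8) - f_hat_exact(0.8))
roundtrip_errors = [abs(inverse_fourier(f_hat_exact, x) - f(x)) for x in (-1.0, 0.0, 0.5, 2.0)]
```

The round trip recovers `f` at the sample points up to discretization error, and only because the inverse transform carries the factor `1 / (2 * pi)`.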

Proposition

(basic properties of the Fourier transform)

The Fourier transform $\widehat{(-)}$ (def. 93) on the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ (def. 92) satisfies the following properties, for all $f,g \in \mathcal{S}(\mathbb{R}^n)$:

  1. (interchanging coordinate multiplication with partial derivatives)

    (112) $$ \widehat{ x^a f } = + i \partial_a \widehat f \phantom{AAAAA} \widehat{ - i\partial_a f} = k_a \widehat f $$

  2. (interchanging pointwise multiplication with convolution product, remark 50):

    (113) $$ \widehat {(f \star g)} = \widehat{f} \cdot \widehat{g} \phantom{AAAA} \widehat{ f \cdot g } = (2\pi)^{-n} \widehat{f} \star \widehat{g} $$

  3. (unitarity, Parseval's theorem)

    $$ \underset{x \in \mathbb{R}^n}{\int} f(x) g^\ast(x)\, d^n x \;=\; \underset{k \in \mathbb{R}^n}{\int} \widehat{f}(k) \widehat{g}^\ast(k) \, \frac{d^n k}{(2\pi)^n} $$

  4. (114) $$ \underset{k \in \mathbb{R}^n}{\int} \widehat{f}(k) \cdot g(k) \, d^n k \;=\; \underset{x \in \mathbb{R}^n}{\int} f(x) \cdot \widehat{g}(x) \, d^n x $$

(e.g. Hörmander 90, lemma 7.1.3, theorem 7.1.6)
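Two of these properties lend themselves to a quick numerical check for $n = 1$: the first identity in (112) and the symmetry (114). The following sketch is only an illustration; the Gaussians `f` and `g`, the truncation of all integrals to $[-10, 10]$, the midpoint rule and the finite-difference step `h` are assumptions made here, and the helper names are hypothetical.

```python
import cmath
import math

def integrate(h, cutoff=10.0, n=400):
    """Midpoint Riemann sum of h over [-cutoff, cutoff] with n cells."""
    dx = 2 * cutoff / n
    return sum(h(-cutoff + (j + 0.5) * dx) for j in range(n)) * dx

def hat(f):
    """Numerical Fourier transform in the convention exp(-i k x), no 2 pi factor."""
    return lambda k: integrate(lambda x: cmath.exp(-1j * k * x) * f(x))

def f(x):
    return math.exp(-x * x / 2)

def g(x):
    return math.exp(-x * x)

# Property (114): the transform can be moved from one factor to the other.
lhs_114 = integrate(lambda k: hat(f)(k) * g(k))
rhs_114 = integrate(lambda x: f(x) * hat(g)(x))
exact_114 = 2 * math.pi / math.sqrt(3)       # closed form for these two Gaussians

# Property (112): transform of x*f equals i times the k-derivative of f_hat,
# with the derivative approximated by a central difference.
k0, h = 0.7, 1e-4
xf_hat = hat(lambda x: x * f(x))(k0)
i_dk_f_hat = 1j * (hat(f)(k0 + h) - hat(f)(k0 - h)) / (2 * h)
```

Both sides of (114) agree with each other and with the closed form, and the two sides of (112) agree up to the finite-difference error.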

The Schwartz space of functions with rapidly decreasing partial derivatives (def. 92) serves the purpose to support the Fourier transform (def. 93) together with its inverse (prop. 94), but for many applications one needs to apply the Fourier transform to more general functions, and in fact to generalized functions in the sense of distributions (via this prop.). But with the Schwartz space in hand, this generalization is readily obtained by formal duality:

Definition

(tempered distribution)

A tempered distribution is a continuous linear functional

$$ u \;\colon\; \mathcal{S}(\mathbb{R}^n) \longrightarrow \mathbb{C} $$

on the Schwartz space (def. 92) of functions with rapidly decreasing partial derivatives. The vector space of all tempered distributions is canonically a topological vector space as the dual space to the Schwartz space, denoted

$$ \mathcal{S}'(\mathbb{R}^n) \;\coloneqq\; \left( \mathcal{S}(\mathbb{R}^n) \right)^\ast \,. $$

(e.g. Hörmander 90, def. 7.1.7)

Example

(some non-singular tempered distributions)

Every function with rapidly decreasing partial derivatives f𝒮( n)f \in \mathcal{S}(\mathbb{R}^n) (def. 92) induces a tempered distribution u f𝒮( n)u_f \in \mathcal{S}'(\mathbb{R}^n) (def. 95) by integrating against it:

u f:gx ng(x)f(x)d nx. u_f \;\colon\; g \mapsto \underset{x \in \mathbb{R}^n}{\int} g(x) f(x)\, d^n x \,.

This construction is a linear inclusion

𝒮( n)dense𝒮( n) \mathcal{S}(\mathbb{R}^n) \overset{\text{dense}}{\hookrightarrow} \mathcal{S}'(\mathbb{R}^n)

of the Schwartz space into its dual space of tempered distributions. This is a dense subspace inclusion.

In fact already the restriction of this inclusion to the compactly supported smooth functions (example 77) is a dense subspace inclusion:

C cp ( n)dense𝒮( n). C^\infty_{cp}(\mathbb{R}^n) \overset{dense}{\hookrightarrow} \mathcal{S}'(\mathbb{R}^n) \,.

This means that every tempered distribution is a limit of a sequence of ordinary functions with rapidly decreasing partial derivatives, and in fact even the limit of a sequence of compactly supported smooth functions (bump functions).

It is in this sense that tempered distributions are “generalized functions”.

(e.g. Hörmander 90, lemma 7.1.8)

Example

(compactly supported distributions are tempered distributions)

Every compactly supported distribution is a tempered distribution (def. 95), hence there is a linear inclusion

( n)𝒮( n). \mathcal{E}'(\mathbb{R}^n) \hookrightarrow \mathcal{S}'(\mathbb{R}^n) \,.
Example

(delta distribution)

Write

δ 0()( n) \delta_0(-) \;\in\; \mathcal{E}'(\mathbb{R}^n)

for the distribution given by point evaluation of functions at the origin of n\mathbb{R}^n:

δ 0():ff(0). \delta_0(-) \;\colon\; f \mapsto f(0) \,.

This is clearly a compactly supported distribution; hence a tempered distribution by example 79.

We write just “δ()\delta(-)” (without the subscript) for the corresponding generalized function (example 78), so that

x nδ(x)f(x)d nxf(0). \underset{x \in \mathbb{R}^n}{\int} \delta(x) f(x) \, d^n x \;\coloneqq\; f(0) \,.
Example

(square integrable functions induce tempered distributions)

Let $f \in L^p(\mathbb{R}^n)$ be a function in the $p$th Lebesgue space; for $p = 2$ this means that $f$ is a square integrable function. Then the operation of integration against the measure $f \, dvol$

gx ng(x)f(x)d nx g \mapsto \underset{x \in \mathbb{R}^n}{\int} g(x) f(x) \, d^n x

is a tempered distribution (def. 95).

(e.g. Hörmander 90, below lemma 7.1.8)

Property (114) of the ordinary Fourier transform on functions with rapidly decreasing partial derivatives motivates and justifies the following generalization:

Definition

(Fourier transform of distributions on tempered distributions)

The Fourier transform of a tempered distribution $u \in \mathcal{S}'(\mathbb{R}^n)$ (def. 95) is the tempered distribution $\widehat u$ defined on a smooth function $f \in \mathcal{S}(\mathbb{R}^n)$ in the Schwartz space (def. 92) by

u^(f)u(f^), \widehat{u}(f) \;\coloneqq\; u\left( \widehat f\right) \,,

where on the right f^𝒮( n)\widehat f \in \mathcal{S}(\mathbb{R}^n) is the Fourier transform of functions from def. 93.

(e.g. Hörmander 90, def. 7.1.9)

Example

(Fourier transform of distributions indeed generalizes Fourier transform of functions with rapidly decreasing partial derivatives)

Let u f𝒮( n)u_f \in \mathcal{S}'(\mathbb{R}^n) be a non-singular tempered distribution induced, via example 78, from a function with rapidly decreasing partial derivatives f𝒮( n)f \in \mathcal{S}(\mathbb{R}^n).

Then its Fourier transform of distributions (def. 96) is the non-singular distribution induced from the Fourier transform of ff:

u f^=u f^. \widehat{u_f} \;=\; u_{\hat f} \,.
Proof

Let g𝒮( n)g \in \mathcal{S}(\mathbb{R}^n). Then

u f^(g) u f(g^) =x nf(x)g^(x)d nx =x nf^(x)g(x)d nx =u f^(g) \begin{aligned} \widehat{u_f}(g) & \coloneqq u_f\left( \widehat{g}\right) \\ & = \underset{x \in \mathbb{R}^n}{\int} f(x) \hat g(x)\, d^n x \\ & = \underset{x \in \mathbb{R}^n}{\int} \hat f(x) g(x) \, d^n x \\ & = u_{\hat f}(g) \end{aligned}

Here all equalities hold by definition, except for the third: this is property (114) from prop. 52.
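The key step of this proof, property (114), can be illustrated numerically. The sketch below assumes $n = 1$ and the convention $\widehat f(k) = \int f(x) e^{-i k x} d x$; the two Gaussian test functions, the grid and the helper names `fourier` and `pair` are hypothetical choices made only for this illustration.

```python
import cmath
import math

# Assumed midpoint grid on [-10, 10] (illustrative resolution).
STEPS = 600
H = 20.0 / STEPS
GRID = [-10.0 + (j + 0.5) * H for j in range(STEPS)]

def fourier(f):
    """Return the function k -> integral of f(x) * exp(-i k x) dx."""
    def f_hat(k):
        return H * sum(f(x) * cmath.exp(-1j * k * x) for x in GRID)
    return f_hat

def pair(u, g):
    """Integration pairing: integral of u(x) * g(x) dx."""
    return H * sum(u(x) * g(x) for x in GRID)

f = lambda x: math.exp(-x * x / 2.0)        # induces the distribution u_f
g = lambda x: math.exp(-(x - 1.0) ** 2)     # Schwartz test function

lhs = pair(f, fourier(g))   # u_f applied to g^, i.e. the transform of u_f on g
rhs = pair(fourier(f), g)   # the distribution induced by f^ applied to g
print(lhs, rhs)
```

The two numbers should agree up to rounding, and their imaginary parts should be negligible.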

Example

(Fourier transform of Klein-Gordon equation of distributions)

Let $\Delta \in \mathcal{S}'(\mathbb{R}^{p,1})$ be any tempered distribution (def. 95) on Minkowski spacetime (def. 23) and let $P \coloneqq \eta^{\mu \nu} \frac{\partial}{\partial x^\mu}\frac{\partial}{\partial x^\nu} - \left( \tfrac{m c}{\hbar} \right)^2$ be the Klein-Gordon operator (65). Then the Fourier transform (def. 96) of $P \Delta$ is, in generalized function-notation (remark 19), given by

$$\widehat {P \Delta}(k) \;=\; \left( - \eta^{\mu \nu}k_\mu k_\nu - \left( \tfrac{m c}{\hbar}\right)^2 \right) \widehat{\Delta}(k) \,.$$
Proof

Let $r \in \mathcal{S}(\mathbb{R}^{p,1})$ be any function with rapidly decreasing partial derivatives (def. 92). Then

PΔ^(r) =PΔ(r^) =Δ(P *r^) =Δ(Pr^) =Δ((η μνk μk ν(mc) 2)r^) \begin{aligned} \widehat {P \Delta}(r) & = P \Delta(\widehat r) \\ & = \Delta(P^\ast \widehat r) \\ & = \Delta(P \widehat r) \\ & = \Delta\left( \left(-\eta^{\mu \nu}k_\mu k_\nu - \left( \tfrac{m c}{\hbar}\right)^2\right) \widehat{r} \right) \end{aligned}

Here the first step is def. 96, the second is def. 77, the third is example 51, while the last step is prop. 52.

Example

(Fourier transform of compactly supported distributions)

Under the identification of smooth functions of bounded growth with non-singular tempered distributions (example 78), the Fourier transform of distributions (def. 96) of a tempered distribution that happens to be compactly supported (example 79)

u( n)𝒮( n) u \in \mathcal{E}'(\mathbb{R}^n) \hookrightarrow \mathcal{S}'(\mathbb{R}^n)

is simply

u^(k)=u(e ik()). \widehat{u}(k) = u\left( e^{- i k \cdot (-)}\right) \,.

(Hörmander 90, theorem 7.1.14)

Example

(Fourier transform of the delta-distribution)

The Fourier transform (def. 96) of the delta distribution (def. 80), via example 84, is the constant function with value 1:

δ^(k) =x nδ(x)e ikxdx =1 \begin{aligned} \widehat {\delta}(k) & = \underset{x \in \mathbb{R}^n}{\int} \delta(x) e^{- i k x} \, d x \\ & = 1 \end{aligned}

This implies by the Fourier inversion theorem (prop. 54) that the delta distribution itself has equivalently the following expression as a generalized function

δ(x) =δ 0^ˇ(x) =k ne ikxd nk(2π) n \begin{aligned} \delta(x) & = \widecheck{\widehat {\delta_0}}(x) \\ & = \underset{k \in \mathbb{R}^n}{\int} e^{i k \cdot x} \, \frac{d^n k}{ (2\pi)^n } \end{aligned}

in the sense that for every function with rapidly decreasing partial derivatives f𝒮( n)f \in \mathcal{S}(\mathbb{R}^n) (def. 92) we have

f(x) =y nf(y)δ(yx)d ny =y nk nf(y)e ik(yx)d nk(2π) nd ny =k ne ikxy nf(y)e ikyd ny=f^(k)d nk(2π) n =+k ne ikxy nf(y)e ikyd ny=f^(k)d nk(2π) n =f^ˇ(x) \begin{aligned} f(x) & = \underset{y \in \mathbb{R}^n}{\int} f(y) \delta(y-x) \, d^n y \\ & = \underset{y \in \mathbb{R}^n}{\int} \underset{k \in \mathbb{R}^n}{\int} f(y) e^{i k \cdot (y-x)} \, \frac{d^n k}{(2\pi)^n} \, d^n y \\ & = \underset{k \in \mathbb{R}^n}{\int} e^{- i k \cdot x} \underset{= \widehat{f}(-k) }{ \underbrace{ \underset{y \in \mathbb{R}^n}{\int} f(y) e^{i k \cdot y} \, d^n y } } \,\, \frac{d^n k}{(2\pi)^n} \\ & = + \underset{k \in \mathbb{R}^n}{\int} e^{i k \cdot x} \underset{= \widehat{f}(k) }{ \underbrace{ \underset{y \in \mathbb{R}^n}{\int} f(y) e^{- i k \cdot y} \, d^n y } } \,\, \frac{d^n k}{(2\pi)^n} \\ & = \widecheck{\widehat{f}}(x) \end{aligned}

which is the statement of the Fourier inversion theorem for smooth functions (prop. 94).

(Here in the last step we used the change of integration variables $k \mapsto -k$, which introduces one sign $(-1)^{n}$ for the new volume form and another sign $(-1)^n$ from the re-orientation of the integration domain, so that the two signs cancel.)

Equivalently, the above computation shows that the delta distribution is the neutral element for the convolution product of distributions.
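The integral expression for $\delta(x)$ can be made concrete by cutting the $k$-integral off at $\vert k \vert \leq K$. For $n = 1$ the truncated integral is the ordinary function $\sin(K t)/(\pi t)$, and pairing it with a test function should reproduce the value of that function as $K$ grows. The following is a minimal sketch of this; the cutoff $K$, the Gaussian test function and the names `truncated_delta` and `smeared` are assumptions made for illustration.

```python
import math

def truncated_delta(t, K):
    """(1 / 2 pi) * integral of exp(i k t) over k in [-K, K] = sin(K t) / (pi t)."""
    if abs(t) < 1e-12:
        return K / math.pi          # limiting value at t = 0
    return math.sin(K * t) / (math.pi * t)

def smeared(f, x, K, lo=-10.0, hi=10.0, steps=8000):
    """Midpoint approximation of the integral of f(y) * truncated_delta(y - x, K) dy."""
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        y = lo + (j + 0.5) * h
        total += f(y) * truncated_delta(y - x, K)
    return h * total

f = lambda y: math.exp(-(y - 0.5) ** 2)   # illustrative Schwartz function
x = 0.3
for K in (1.0, 5.0, 20.0):
    # The error should shrink as the cutoff K is removed.
    print(K, abs(smeared(f, x, K) - f(x)))
```

For a Gaussian test function the convergence is very fast, because its Fourier transform is itself a Gaussian and the discarded modes $\vert k \vert \gt K$ carry almost no weight.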

Proposition

(Paley-Wiener-Schwartz theorem)

Let u( n)𝒮( n)u \in \mathcal{E}'(\mathbb{R}^n) \hookrightarrow \mathcal{S}'(\mathbb{R}^n) be a compactly supported distribution regarded as a tempered distribution by example 79. Then its Fourier transform of distributions (def. 96) is a non-singular distribution induced from a smooth function that grows at most exponentially.

Proposition

(Fourier inversion theorem for Fourier transform of distributions)

The operation of forming the Fourier transform $\widehat{u}$ (def. 96) of tempered distributions $u \in \mathcal{S}'(\mathbb{R}^n)$ (def. 95) is an isomorphism, with inverse given by

uˇ:gu(gˇ), \widecheck{ u } \;\colon\; g \mapsto u\left( \widecheck{g}\right) \,,

where on the right gˇ\widecheck{g} is the ordinary inverse Fourier transform of gg according to prop. 94.

Proof

By def. 96 this follows immediately from the Fourier inversion theorem for smooth functions (prop. 94).

We have the following distributional generalization of the basic property (113) from prop. 52:

Proposition

(Fourier transform of distributions interchanges convolution of distributions with pointwise product)

Let

u 1𝒮( n) u_1 \in \mathcal{S}'(\mathbb{R}^n)

be a tempered distribution (def. 95) and

u 2( n)𝒮( n) u_2 \in \mathcal{E}'(\mathbb{R}^n) \hookrightarrow \mathcal{S}'(\mathbb{R}^n)

be a compactly supported distribution, regarded as a tempered distribution via example 79.

Observe here that the Paley-Wiener-Schwartz theorem (prop. 53) implies that the Fourier transform of distributions of the compactly supported factor $u_2$ is a non-singular distribution $\widehat{u_2} \in C^\infty(\mathbb{R}^n)$, so that the product $\widehat{u_1} \cdot \widehat{u_2}$ is always defined.

Then the Fourier transform of distributions of the convolution product of distributions is the product of the Fourier transform of distributions:

u 1u 2^=u 1^u 2^. \widehat{u_1 \star u_2} \;=\; \widehat{u_1} \cdot \widehat{u_2} \,.

(e.g. Hörmander 90, theorem 7.1.15)

Remark

(product of distributions via Fourier transform of distributions)

Prop. 55 together with the Fourier inversion theorem (prop. 54) suggests defining the product of distributions $u_1 \cdot u_2$ for compactly supported distributions $u_1, u_2 \in \mathcal{E}'(\mathbb{R}^n) \hookrightarrow \mathcal{S}'(\mathbb{R}^n)$ by the formula

$$\widehat{ u_1 \cdot u_2 } \;\coloneqq\; (2\pi)^{-n} \, \widehat{u_1} \star \widehat{u_2}$$

which would complete the generalization of property (113) from prop. 52.

For this to make sense, the convolution product of the smooth functions on the right needs to exist, which is not guaranteed (prop. 50 does not apply here!). The condition that this exists is the Hörmander criterion on the wave front set of u 1u_1 and u 2u_2. This we further discuss in Microlocal analysis and UV-Divergences below.

\,

microlocal analysis and ultraviolet divergences

A distribution (def. 37) or generalized function (prop. 38) is like a smooth function which may have “singularities”, namely points at which its values or those of its derivatives “become infinite”. Conversely, smooth functions are the non-singular distributions (prop. 38). The collection of points around which a distribution is singular (i.e. not non-singular) is called its singular support (def. 99 below).

The Fourier transform of distributions (def. 96) decomposes a generalized function into the plane wave modes that it is made of (def. 91). The Paley-Wiener-Schwartz theorem (prop. 56 below) says that the singular nature of a compactly supported distribution may be read off from this Fourier mode decomposition: Singularities correspond to large contributions by Fourier modes of high frequency and small wavelength, hence to large “ultraviolet” (UV) contributions. Therefore the singular support of a distribution is the set of points around which the Fourier transform does not sufficiently decay “in the UV”.

But since the Fourier transform is a function of the full wave vector of the plane wave modes (def. 91), not just of the frequency/wavelength, but also of the direction of the wave vector, this means that it contains directional information about the singularities: A distribution may have UV-singularities at some point and in some wave vector direction, but maybe not in other directions.

In particular, if the distribution in question is a distributional solution to a partial differential equation (def. 77) on spacetime then the propagation of singularities theorem (prop. 59 below) says that the singular support of the solution evolves in spacetime along the direction of those wave vectors in which the Fourier transform exhibits high UV contributions. This means that these directions are the “wave front” of the distributional solution. Accordingly, the singular support of a distribution together with, over each of its points, the directions of wave vectors in which the Fourier transform around that point has large UV contributions is called the wave front set of the distribution (def. 101 below).

What is called microlocal analysis is essentially the analysis of distributions with attention to their wave front set, hence to the wave vector-directions of UV divergences.

In particular the product of distributions is well defined (only) if the wave front sets of the distributions do not “collide”. And this in fact motivates the definition of the wave front set:

To see this, let u,v𝒟( 1)u,v \in \mathcal{D}'(\mathbb{R}^1) be two distributions, for simplicity of exposition taken on the real line.

Since the product $u \cdot v$ is, if it exists, supposed to generalize the pointwise product of smooth functions, it must be fixed locally: for every point $x \in \mathbb{R}$ there ought to be a compactly supported smooth function (bump function) $b \in C^\infty_{cp}(\mathbb{R})$ with $b(x) = 1$ such that

b 2uv=(bu)(bv). b^2 u \cdot v = (b u) \cdot (b v) \,.

But now bvb v and bub u are both compactly supported distributions (def. 100 below), and these have the special property that their Fourier transforms bv^\widehat{b v} and bu^\widehat{b u} are, in particular, smooth functions (by the Paley-Wiener-Schwartz theorem, prop 53).

Moreover, the operation of Fourier transform interchanges pointwise products with convolution products (prop. 52). This means that if the product of distributions $u \cdot v$ exists, it must locally be given by the inverse Fourier transform of the convolution product of the Fourier transforms $\widehat {b u}$ and $\widehat {b v}$:

b 2uv^(x)=limk max k max k max(bu)^(k)(bv)^(xk)dk. \widehat{ b^2 u \cdot v }(x) \;=\; \underset{\underset{k_{max} \to \infty}{\longrightarrow}}{\lim} \, \int_{- k_{max}}^{k_{max}} \widehat{(b u)}(k) \widehat{(b v)}(x - k) d k \,.

(Notice that the converse of this formula holds as a fact by prop. 55)

This shows that the product of distributions exists once there is a bump function bb such that the integral on the right converges as k maxk_{max} \to \infty.

Now the Paley-Wiener-Schwartz theorem says more: it says that the Fourier transforms $\widehat {b u}$ and $\widehat {b v}$ are polynomially bounded. On the other hand, the integral above is well defined if the integrand decreases at least quadratically as $k \to \pm\infty$. This means that for the convolution product to be well defined, either $\widehat {b u}$ has to decrease polynomially faster with $k \to \pm \infty$ than $\widehat {b v}$ grows in the other direction, $k \to \mp \infty$ (due to the minus sign in the argument of the second factor in the convolution product), or the other way around.

Moreover, the degree of polynomial growth of the Fourier transform increases by one with each derivative. Therefore if the product law for derivatives of distributions is to hold generally, we need that either bu^\widehat{b u} or bv^\widehat{b v} decays faster than any polynomial in the opposite of the directions in which the respective other factor does not decay.

Here the set of directions of wave vectors in which the Fourier transform of a distribution localized around any point does not decay faster than any polynomial is the wave front set of a distribution (def. 101 below). Hence the condition that the product of two distributions is well defined is that for each wave vector direction in the wave front set of one of the two distributions, the opposite direction must not be an element of the wave front set of the other distribution. This is called Hörmander's criterion (prop. 58 below).

We now say this in detail:

Definition

(restriction of distributions)

For $U \subset \mathbb{R}^n$ an open subset and $u \in \mathcal{D}'(\mathbb{R}^n)$ a distribution, the restriction of $u$ to $U$ is the distribution

$$u\vert_U \in \mathcal{D}'(U)$$

given by restricting $u$ to test functions whose support is in $U$.

Definition

(singular support of a distribution)

Given a distribution u𝒟( n)u \in \mathcal{D}'(\mathbb{R}^n), a point x nx \in \mathbb{R}^n is a singular point if there is no neighbourhood U nU \subset \mathbb{R}^n of xx such that the restriction u| Uu\vert_U (def. 98) is a non-singular distribution (given by a smooth function).

The set of all singular points is the singular support supp sing(u) nsupp_{sing}(u) \subset \mathbb{R}^n of uu.

Definition

(product of a distribution with a smooth function)

Let $u \in \mathcal{D}'(\mathbb{R}^n)$ be a distribution, and $f \in C^\infty(\mathbb{R}^n)$ a smooth function. Then the product $f u \in \mathcal{D}'(\mathbb{R}^n)$ is the evident distribution given on a test function $b \in C^\infty_{cp}(\mathbb{R}^n)$ by

$$f u \;\colon\; b \mapsto u(f \cdot b) \,.$$
Proposition

(Paley-Wiener-Schwartz theorem – decay of Fourier transform of compactly supported functions)

A compactly supported distribution $u \in \mathcal{E}'(\mathbb{R}^n)$ is non-singular, hence given by a compactly supported function $b \in C^\infty_{cp}(\mathbb{R}^n)$ via $u(f) = \int b(x) f(x) dvol(x)$, precisely if its Fourier transform $\hat u$ (def. 96) satisfies the following decay property:

For all $N \in \mathbb{N}$ there exists $C_N \in \mathbb{R}_+$ such that for all $k \in \mathbb{R}^n$ we have that the absolute value ${\vert \hat u(k)\vert}$ of the Fourier transform at that point is bounded by

(115) ${\vert \hat u(k)\vert} \;\leq\; C_N \left( 1 + {\vert k\vert} \right)^{-N} \,.$

(Hörmander 90, around (8.1.1))
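The decay estimate (115) can be illustrated on the real line by comparing a smooth bump function with the same bump cut off by the step function at the origin. The sketch below assumes the convention $\widehat u(k) = \int u(x) e^{-i k x} d x$ and the standard bump $x \mapsto \exp(-1/(1 - x^2))$ on $(-1,1)$; the names `bump`, `smooth_hat` and `cut_hat` and the quadrature resolution are illustrative assumptions.

```python
import cmath
import math

def bump(x):
    """Standard smooth bump function supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def fourier(f, k, lo, hi, steps=20000):
    """Midpoint approximation of the integral of f(x) * exp(-i k x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0j
    for j in range(steps):
        x = lo + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * k * x)
    return h * total

def smooth_hat(k):
    # Transform of the non-singular compactly supported distribution given by bump.
    return fourier(bump, k, -1.0, 1.0)

def cut_hat(k):
    # Transform of (step function) * bump, which jumps at the origin.
    return fourier(bump, k, 0.0, 1.0)

for k in (10.0, 20.0, 40.0, 80.0):
    # k * |cut_hat(k)| should stay roughly constant (decay only like 1/k),
    # while |smooth_hat(k)| should fall off much faster.
    print(k, abs(smooth_hat(k)), k * abs(cut_hat(k)))
```

The slow $1/k$ decay of the cut-off bump in both directions $k \to \pm\infty$ reflects its jump singularity at the origin, in line with the proposition.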

Definition

(wavefront set)

Let u𝒟( n)u \in \mathcal{D}'(\mathbb{R}^n) be a distribution.

For bC cp ( n)b \in C^\infty_{cp}(\mathbb{R}^n) a compactly supported smooth function, write bu( n)b u \in \mathcal{E}'(\mathbb{R}^n) for the corresponding product (def. 100), which is now a compactly supported distribution.

For $x \in supp(b) \subset \mathbb{R}^n$, we say that a unit covector $k \in S((\mathbb{R}^n)^\ast)$ is regular if there exists a neighbourhood $U \subset S((\mathbb{R}^n)^\ast)$ of $k$ in the unit sphere such that for all $c k' \in (\mathbb{R}^n)^\ast$ with $c \in \mathbb{R}_+$ and $k' \in U$ the decay estimate (115) is valid for the Fourier transform $\widehat{b u}$ of $b u$ at $c k'$. Otherwise $k$ is non-regular. Write

Σ(bu){kS(( n) *)|knon-regular} \Sigma(b u) \;\coloneqq\; \left\{ k \in S((\mathbb{R}^n)^\ast) \;\vert\; k \, \text{non-regular} \right\}

for the set of non-regular covectors of bub u.

The wave front set at xx is the intersection of these sets as bb ranges over bump functions whose support includes xx:

Σ x(u)bC cp ( n)xsupp(b)Σ(bu). \Sigma_x(u) \;\coloneqq\; \underset{ { b \in C^\infty_{cp}(\mathbb{R}^n) } \atop { x \in supp(b) } }{\cap} \Sigma(b u) \,.

Finally the wave front set of $u$ is the subset of the sphere bundle $S(T^\ast \mathbb{R}^n)$ which over $x \in \mathbb{R}^n$ consists of $\Sigma_x(u) \subset T^\ast_x \mathbb{R}^n$:

WF(u)x nΣ x(u)S(T * n) WF(u) \;\coloneqq\; \underset{x \in \mathbb{R}^n}{\cup} \Sigma_x(u) \;\subset\; S(T^\ast \mathbb{R}^n)

Often this is equivalently considered as the full conical set inside the cotangent bundle generated by the unit covectors under multiplication with positive real numbers.

(Hörmander 90, def. 8.1.2)

Remark

(wave front set is the UV divergence-direction-bundle over the singular support)

For $u \in \mathcal{D}'(\mathbb{R}^n)$, the Paley-Wiener-Schwartz theorem (prop. 56) implies that

  1. Forgetting the direction covectors in the wave front set $WF(u)$ (def. 101) and remembering only the points where they are based yields the set of singular points of $u$, hence the singular support (def. 99)

    WF(u) supp sing(u) n \array{ WF(u) \\ \downarrow \\ supp_{sing}(u) &\hookrightarrow& \mathbb{R}^n }
  2. the wave front set is empty, precisely if the singular support is empty, which is the case precisely if uu is a non-singular distribution.

Example

(wave front set of delta distribution)

Consider the delta distribution

δ 0𝒟( n) \delta_0 \in \mathcal{D}'(\mathbb{R}^n)

given by evaluation at the origin. Its wave front set (def. 101) consists of all the directions at the origin:

WF(δ 0)={(0,k)|k n{0}} n× nT * n. WF(\delta_0) \;=\; \left\{ (0,k) \;\vert\; k \in \mathbb{R}^n \setminus \{0\} \right\} \subset \mathbb{R}^n \times \mathbb{R}^n \simeq T^\ast \mathbb{R}^n \,.
Proof

First of all the singular support (def. 99) of $\delta_0$ is clearly $supp_{sing}(\delta_0) = \{0\}$, hence by remark 22 the wave front set vanishes over $\mathbb{R}^n \setminus \{0\}$.

At the origin, any bump function $b$ supported around the origin with $b(0) = 1$ satisfies $b \cdot \delta_0 = \delta_0$ and hence the wave front set over the origin is the set of covectors along which the Fourier transform $\widehat{\delta_0}$ does not suitably decay. But this Fourier transform is in fact a constant function (example 97) and hence does not decay in any direction.

Example

(wave front set of step function)

Let Θ𝒟( 1)\Theta \in \mathcal{D}'(\mathbb{R}^1) be the Heaviside step function given by

Θ(b) 0 b(x)dx. \Theta(b) \coloneqq \int_0^\infty b(x)\, d x \,.

Its wave front set (def. 101) is

$$WF(\Theta) = \{(0,k) \;\vert\; k \neq 0\} \,.$$
Proposition

(wave front set of convolution of compactly supported distributions)

Let $u,v \in \mathcal{E}'(\mathbb{R}^n)$ be two compactly supported distributions. Then the wave front set (def. 101) of their convolution of distributions satisfies

$$WF(u \star v) \;\subset\; \left\{ (x + y, k) \;\vert\; (x,k) \in WF(u) \,\text{and}\, (y,k) \in WF(v) \right\} \,.$$

(Bengel 77, prop. 3.1)

Proposition

(Hörmander's criterion for product of distributions)

Let $u, v \in \mathcal{D}'(\mathbb{R}^n)$ be two distributions, and write $X \coloneqq \mathbb{R}^n$. If their wave front sets (def. 101) do not collide, in that for $k \in T^\ast_x X$ a covector contained in one of the two wave front sets the covector $-k \in T^\ast_x X$ with the opposite direction is not contained in the other wave front set, i.e. if the fiber product inside the cotangent bundle $T^\ast X$ of the pointwise sum of wave fronts with the zero section is empty:

$$\left( WF(u) + WF(v) \right) \underset{T^\ast X}{\times} X \;=\; \emptyset$$

i.e.

$$\array{ && \emptyset \\ & \swarrow && \searrow \\ WF(u) + WF(v) && (pb) && X \\ & \searrow && \swarrow_{\mathrlap{0}} \\ && T^\ast X }$$

then the product of distributions $u \cdot v$ exists, given, locally, by the Fourier inversion of the convolution product of their Fourier transform of distributions.
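On the real line a covector direction is just a sign, so the non-collision condition becomes a finite check. The following is a minimal sketch under that simplification: wave front sets are encoded, purely for illustration, as dictionaries sending a base point to the set of directions `+1`/`-1` over it. The encoding, the function name `product_defined` and the one-sided example are hypothetical and not taken from the text; only the wave front set of the delta distribution is the one computed above.

```python
def product_defined(wf_u, wf_v):
    """Check the non-collision condition for 1D wave front sets.

    wf_u, wf_v: dict mapping a base point x to a set of directions (+1 or -1).
    Returns False if some direction k over x lies in wf_u while -k lies in wf_v.
    (This single test is symmetric in u and v, since k ranges over all of wf_u.)
    """
    for x, directions_u in wf_u.items():
        directions_v = wf_v.get(x, set())
        if any(-k in directions_v for k in directions_u):
            return False
    return True

# Delta distribution at the origin: all directions over 0 (see the example above).
wf_delta_0 = {0.0: {+1, -1}}
# Delta distribution at a different point (assumed encoding of its translate).
wf_delta_1 = {1.0: {+1, -1}}
# Hypothetical distribution that is singular at 0 in the positive direction only.
wf_one_sided = {0.0: {+1}}
wf_mirrored = {0.0: {-1}}

print(product_defined(wf_delta_0, wf_delta_0))      # criterion fails: directions collide
print(product_defined(wf_delta_0, wf_delta_1))      # singular at different points
print(product_defined(wf_one_sided, wf_one_sided))  # same direction, no collision
print(product_defined(wf_one_sided, wf_mirrored))   # opposite directions collide
```

Note that the criterion is only a sufficient condition: when the check returns `False` the proposition simply makes no statement about the product.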

Definition

(symbol of a differential operator)

Let

$$D \;=\; \underset{j \leq N}{\sum} D^{\mu_1 \cdots \mu_j} \frac{\partial}{\partial x^{\mu_1}} \cdots \frac{\partial}{\partial x^{\mu_j}} + D^0$$

be a differential operator on $\mathbb{R}^n$ (def. 56). Then its symbol is the smooth function on the cotangent bundle $T^\ast \mathbb{R}^n \simeq \mathbb{R}^n \times \mathbb{R}^n$ (def. 5) given by

$$\array{ T^\ast \mathbb{R}^n &\overset{q}{\longrightarrow}& \mathbb{C} \\ k &\mapsto& \underset{j \leq N}{\sum} D^{\mu_1 \cdots \mu_j} k_{\mu_1} \cdots k_{\mu_j} } \,.$$

The principal symbol is the top degree homogeneous part $D^{\mu_1 \cdots \mu_N} k_{\mu_1} \cdots k_{\mu_N}$.

Definition

(symbol order)

A smooth function qq on the cotangent bundle T * nT^\ast \mathbb{R}^n (e.g. the symbol of a differential operator, def. 102 ) is of order mm (and type 1,01,0, denoted qS m=S 1,0 mq \in S^m = S^m_{1,0}), for mm \in \mathbb{N}, if on each coordinate chart ((x i),(k i))((x^i), (k_i)) we have that for every compact subset KK of the base space and all multi-indices α\alpha and β\beta, there is a real number C α,β,KC_{\alpha, \beta,K } \in \mathbb{R} such that the absolute value of the partial derivatives of qq is bounded by

| αk α βx βq(x,k)|C α,β,K(1+|k|) m|α| \left\vert \frac{\partial^\alpha}{\partial k_\alpha} \frac{\partial^\beta}{\partial x^\beta} q(x,k) \right\vert \;\leq\; C_{\alpha,\beta,K}\left( 1+ {\vert k\vert}\right)^{m - {\vert \alpha\vert}}

for all xKx \in K and all cotangent vectors kk to xx.

A Fourier integral operator QQ is of symbol class L m=L 1,0 mL^m = L^m_{1,0} if it is of the form

Qf(x)=e ik(xy)q(x,y,k)f(y)dydk Q f (x) \;=\; \int \int e^{i k \cdot (x - y)} q(x,y,k) f(y) \, d y \, d k

with symbol qq of order mm, in the above sense.

(Hörmander 71, def. 1.1.1 and first sentence of section 2.1 with (1.4.1))

Proposition

(propagation of singularities theorem)

Let $Q$ be a pseudo-differential operator on some smooth manifold $X$ which is properly supported and of symbol class $L^m$ (def. 103) with real principal symbol $q$ that is homogeneous of degree $m$.

For $u \in \mathcal{D}'(X)$ a distribution with $Q u = f$, the complement of the wave front set of $f$ inside that of $u$ is contained in the set of covectors on which the principal symbol $q$ vanishes:

WF(u)WF(f)q 1(0). WF(u) \setminus WF(f) \;\subset\; q^{-1}(0) \,.

Moreover, $WF(u)$ is invariant under the bicharacteristic flow induced by the Hamiltonian vector field of $q$ with respect to the canonical symplectic manifold structure on the cotangent bundle.

(Duistermaat-Hörmander 72, theorem 6.1.1, recalled for instance as Radzikowski 96, theorem 4.6)

\,

Cauchy principal value

An important application of the Fourier analysis of distributions is the class of distributions known broadly as Cauchy principal values. Below we will find that these control the detailed nature of the various propagators of free field theories: notably the Feynman propagator is manifestly a Cauchy principal value (prop. 71 and def. 110 below), and the singular support properties of the causal propagator and the Hadamard propagator are also governed by Cauchy principal values (prop. 72 and prop. 73 below). In this way the understanding of Cauchy principal values eventually allows us to determine the wave front set of all the propagators (prop. 75 below).

Therefore we now collect some basic definitions and facts on Cauchy principal values.

The Cauchy principal value of a function which is integrable on the complement of one point is, if it exists, the limit of the integrals of the function over subsets in the complement of this point as these integration domains tend to that point symmetrically from all sides.

One also subsumes the case that the “point” is “at infinity”, hence that the function is integrable over every bounded domain. In this case the Cauchy principal value is the limit, if it exists, of the integrals of the function over bounded domains, as their bounds tend symmetrically to infinity.

The operation of sending a compactly supported smooth function (bump function) to the Cauchy principal value of its pointwise product with a function $f$ that may be singular at the origin defines a distribution, usually denoted $PV(f)$.

Definition

(Cauchy principal value of an integral over the real line)

Let f:f \colon \mathbb{R} \to \mathbb{R} be a function on the real line such that for every positive real number ϵ\epsilon its restriction to (ϵ,ϵ)\mathbb{R}\setminus (-\epsilon, \epsilon) is integrable. Then the Cauchy principal value of ff is, if it exists, the limit

PV(f)limϵ0(ϵ,ϵ)f(x)dx. PV(f) \coloneqq \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R} \setminus (-\epsilon, \epsilon)}{\int} f(x) \, d x \,.
Definition

(Cauchy principal value as distribution on the real line)

Let f:f \colon \mathbb{R} \to \mathbb{R} be a function on the real line such that for all bump functions bC cp ()b \in C^\infty_{cp}(\mathbb{R}) the Cauchy principal value of the pointwise product function fbf b exists, in the sense of def. 104. Then this assignment

PV(f):bPV(fb) PV(f) \;\colon\; b \mapsto PV(f b)

defines a distribution PV(f)𝒟()PV(f) \in \mathcal{D}'(\mathbb{R}).

Example

Let f:f \colon \mathbb{R} \to \mathbb{R} be an integrable function which is symmetric, in that f(x)=f(x)f(-x) = f(x) for all xx \in \mathbb{R}. Then the principal value integral (def. 104) of xf(x)xx \mapsto \frac{f(x)}{x} exists and is zero:

limϵ0(ϵ,ϵ)f(x)xdx=0 \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}\setminus (-\epsilon, \epsilon)}{\int} \frac{f(x)}{x} d x \; = \; 0

This is because, by the symmetry of $f$ and the skew-symmetry of $x \mapsto 1/x$, the two contributions to the integral are equal up to a sign:

ϵf(x)xdx= ϵ f(x)xdx. \int_{-\infty}^{-\epsilon} \frac{f(x)}{x} d x \;=\; - \int_{\epsilon}^\infty \frac{f(x)}{x} d x \,.
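The symmetric excision in the definition of the principal value can be imitated numerically. The following is a minimal sketch, assuming a finite cutoff in place of the infinite integration domain and a midpoint rule on mirrored grids; the function name `pv_integral`, the cutoff and the two Gaussian integrands are illustrative choices.

```python
import math

def pv_integral(f, eps, cutoff=20.0, steps=20000):
    """Approximate the integral of f(x)/x over [-cutoff, -eps] and [eps, cutoff].

    The two half-lines are sampled on mirrored midpoint grids, so that the
    large contributions near the excised interval cancel term by term.
    """
    h = (cutoff - eps) / steps
    total = 0.0
    for j in range(steps):
        x = eps + (j + 0.5) * h
        total += f(x) / x + f(-x) / (-x)
    return h * total

even = lambda x: math.exp(-x * x)               # symmetric: f(-x) = f(x)
shifted = lambda x: math.exp(-(x - 1.0) ** 2)   # not symmetric

# For the symmetric function the two half-line contributions cancel.
print(pv_integral(even, 1e-3))
# For the shifted Gaussian the values should stabilise as eps shrinks.
for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    print(eps, pv_integral(shifted, eps))
```

The second loop shows the limit $\epsilon \to 0$ existing also for a non-symmetric integrand, which is the situation relevant for the distribution $PV(1/x)$ evaluated on a general bump function.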
Example

The Cauchy principal value distribution PV(1x)PV\left( \frac{1}{x}\right) (def. 105) solves the distributional equation

(116)xPV(1x)=1AAA𝒟( 1). x PV\left(\frac{1}{x}\right) = 1 \phantom{AAA} \in \mathcal{D}'(\mathbb{R}^1) \,.

Since the delta distribution $\delta \in \mathcal{D}'(\mathbb{R}^1)$ solves the equation

$$x \delta(x) = 0 \phantom{AAA} \in \mathcal{D}'(\mathbb{R}^1)$$

we have that more generally every linear combination of the form

(117)F(x)PV(1/x)+cδ(x)AAA𝒟( 1) F(x) \coloneqq PV(1/x) + c \delta(x) \phantom{AAA} \in \mathcal{D}'(\mathbb{R}^1)

for cc \in \mathbb{C}, is a distributional solution to xF(x)=1x F(x) = 1.

The wave front set of all these solutions is

WF(PV(1/x)+cδ(x))={(0,k)|k *{0}}. WF\left( PV(1/x) + c \delta(x) \right) \;=\; \left\{ (0,k) \;\vert\; k \in \mathbb{R}^\ast \setminus \{0\} \right\} \,.
Proof

The first statement is immediate from the definition: For bC c ( 1)b \in C^\infty_c(\mathbb{R}^1) any bump function we have that

xPV(1x),b limϵ0 1(ϵ,ϵ)xxb(x)dx =b(x)dx =1,b \begin{aligned} \left\langle x PV\left(\frac{1}{x}\right), b \right\rangle & \coloneqq \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1 \setminus (-\epsilon, \epsilon)}{\int} \frac{x}{x}b(x) \, d x \\ & = \int b(x) d x \\ & = \langle 1,b\rangle \end{aligned}

Regarding the second statement: It is clear that the wave front set is concentrated at the origin. By symmetry of the distribution around the origin, it must contain both directions.

Proposition

In fact (117) is the most general distributional solution to (116).

This follows by the characterization of extension of distributions to a point (Hörmander 90, thm. 3.2.4).

Definition

(integration against inverse variable with imaginary offset)

Write

$$ \tfrac{1}{x + i0^\pm} \;\in\; \mathcal{D}'(\mathbb{R}) $$

for the distribution which is the limit in $\mathcal{D}'(\mathbb{R})$ of the non-singular distributions given by the smooth functions $x \mapsto \tfrac{1}{x \pm i \epsilon}$ as the positive real number $\epsilon$ tends to zero:

$$ \frac{1}{ x + i 0^\pm } \;\coloneqq\; \underset{ { \epsilon \in (0,\infty) } \atop { \epsilon \to 0 } }{\lim} \tfrac{1}{x \pm i \epsilon} $$

hence the distribution which sends a bump function $b \in C^\infty_c(\mathbb{R}^1)$ to

$$ b \mapsto \underset{ { \epsilon \in (0,\infty) } \atop { \epsilon \to 0 } }{\lim} \underset{\mathbb{R}}{\int} \frac{b(x)}{x \pm i \epsilon} \, d x \,. $$
Proposition

(Cauchy principal value equals integration with imaginary offset plus delta distribution)

The Cauchy principal value distribution $PV\left( \tfrac{1}{x}\right) \in \mathcal{D}'(\mathbb{R})$ (def. 105) is equal to the sum of the integration over $1/x$ with imaginary offset (def. 106) and a multiple of the delta distribution:

$$ PV\left(\frac{1}{x}\right) \;=\; \frac{1}{x + i 0^\pm} \pm i \pi \delta \,. $$

In particular, by prop. 88 this means that $\tfrac{1}{x + i 0^\pm}$ solves the distributional equation

$$ x \frac{1}{x + i 0^\pm} \;=\; 1 \phantom{AA} \in \mathcal{D}'(\mathbb{R}^1) \,. $$
Proof

Using that

$$ \begin{aligned} \frac{1}{x \pm i \epsilon} & = \frac{ x \mp i \epsilon }{ (x + i \epsilon)(x - i \epsilon) } \\ & = \frac{ x \mp i \epsilon }{(x^2 + \epsilon^2)} \end{aligned} $$

we have for every bump function $b \in C^\infty_{cp}(\mathbb{R}^1)$

$$ \begin{aligned} \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1}{\int} \frac{b(x)}{x \pm i \epsilon} d x & \;=\; \underset{ (A) }{ \underbrace{ \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1}{\int} \frac{x^2}{x^2 + \epsilon^2} \frac{b(x)}{x} d x } } \mp i \pi \underset{(B)}{ \underbrace{ \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1}{\int} \frac{1}{\pi} \frac{\epsilon}{x^2 + \epsilon^2} b(x) \, d x }} \end{aligned} $$

Since

$$ \array{ && \frac{x^2}{x^2 + \epsilon^2} \\ & {}^{\mathllap{ { {\vert x \vert} \lt \epsilon } \atop { \epsilon \to 0 } }}\swarrow && \searrow^{\mathrlap{ {{\vert x\vert} \gt \epsilon} \atop { \epsilon \to 0 } }} \\ 0 && && 1 } $$

it is plausible that $(A) = PV\left( \frac{b(x)}{x} \right)$, and similarly that $(B) = b(0)$. In detail:

$$ \begin{aligned} (A) & = \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1}{\int} \frac{x}{x^2 + \epsilon^2} b(x) d x \\ & = \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1}{\int} \frac{d}{d x} \left( \tfrac{1}{2} \ln(x^2 + \epsilon^2) \right) b(x) d x \\ & = -\tfrac{1}{2} \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1}{\int} \ln(x^2 + \epsilon^2) \frac{d b}{d x}(x) d x \\ & = -\tfrac{1}{2} \underset{\mathbb{R}^1}{\int} \ln(x^2) \frac{d b}{d x}(x) d x \\ & = - \underset{\mathbb{R}^1}{\int} \ln({\vert x \vert}) \frac{d b}{d x}(x) d x \\ & = - \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1\setminus (-\epsilon, \epsilon)}{\int} \ln( {\vert x \vert} ) \frac{d b}{d x}(x) d x \\ & = \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1\setminus (-\epsilon, \epsilon)}{\int} \frac{1}{x} b(x) d x \\ & = PV\left( \frac{b(x)}{x} \right) \end{aligned} $$

and

$$ \begin{aligned} (B) & = \tfrac{1}{\pi} \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1}{\int} \frac{\epsilon}{x^2 + \epsilon^2} b(x) \, d x \\ & = \tfrac{1}{\pi} \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1}{\int} \left( \frac{d}{d x} \arctan\left( \frac{x}{\epsilon} \right) \right) b(x) \, d x \\ & = - \tfrac{1}{\pi} \underset{\epsilon \to 0}{\lim} \underset{\mathbb{R}^1}{\int} \arctan\left( \frac{x}{\epsilon} \right) \frac{d b}{d x}(x) \, d x \\ & = - \frac{1}{2} \underset{\mathbb{R}^1}{\int} sgn(x) \frac{d b}{d x}(x) \, d x \\ & = b(0) \end{aligned} $$

where we used that the derivative of the arctan function is $\frac{d}{ d x} \arctan(x) = 1/(1 + x^2)$ and that $\underset{\epsilon \to 0^+}{\lim} \arctan(x/\epsilon) = \tfrac{\pi}{2}sgn(x)$ is proportional to the sign function.
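The statement can be probed numerically at small but finite $\epsilon$. The following is a minimal sketch, not part of the original argument: it assumes a Gaussian test function `b` (a Schwartz function standing in for a bump function), a plain midpoint rule, and hypothetical helper names `midpoint`, `offset_integral` and `principal_value`. It compares $\int b(x)/(x + i\epsilon)\, d x$ with $PV(b(x)/x) - i \pi b(0)$.

```python
import math

def b(x):
    # Assumed test function: a Gaussian centred at 1 (Schwartz, not compactly supported).
    return math.exp(-(x - 1.0) ** 2)

def midpoint(f, lo, hi, n):
    # Plain midpoint rule on [lo, hi] with n cells.
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def offset_integral(eps, lo=-9.0, hi=11.0, n=200_000):
    # Integral of b(x)/(x + i*eps), split into the parts called (A) and (B) in the proof:
    # 1/(x + i*eps) = x/(x^2 + eps^2) - i*eps/(x^2 + eps^2).
    re = midpoint(lambda x: x / (x * x + eps * eps) * b(x), lo, hi, n)
    im = midpoint(lambda x: -eps / (x * x + eps * eps) * b(x), lo, hi, n)
    return complex(re, im)

def principal_value(upper=11.0, n=200_000):
    # PV of b(x)/x written as the integral of (b(x) - b(-x))/x over x > 0,
    # whose integrand stays bounded near x = 0.
    return midpoint(lambda x: (b(x) - b(-x)) / x, 0.0, upper, n)

z = offset_integral(1e-3)
print(z.real - principal_value())      # small, of order eps
print(z.imag + math.pi * b(0.0))       # small, of order eps
```

Both printed differences should shrink roughly linearly as `eps` is decreased, provided the grid spacing stays well below `eps`.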

Example

(Fourier integral formula for step function)

The Heaviside distribution $\Theta \in \mathcal{D}'(\mathbb{R})$ is equivalently the following Cauchy principal value (def. 105):

$$ \begin{aligned} \Theta(x) & = \frac{1}{2\pi i} \int_{-\infty}^\infty \frac{e^{i \omega x}}{\omega - i 0^+} d\omega \\ & \coloneqq \underset{ \epsilon \to 0^+}{\lim} \frac{1}{2 \pi i} \int_{-\infty}^\infty \frac{e^{i \omega x}}{\omega - i \epsilon} d\omega \,, \end{aligned} $$

where the limit is taken over sequences of positive real numbers $\epsilon \in (0,\infty)$ tending to zero.

Proof

We may think of the integrand $\frac{e^{i \omega x}}{\omega - i \epsilon}$ as uniquely extended to a meromorphic function on the complex plane and compute the given real line integral for fixed $\epsilon$ as a contour integral in the complex plane.

If $x \in (0,\infty)$ is positive, then the exponent

$$ i \omega x = - Im(\omega) x + i Re(\omega) x $$

has negative real part for positive imaginary part of $\omega$. This means that the line integral equals the complex contour integral over a contour $C_+ \subset \mathbb{C}$ closing in the upper half plane. Since $i \epsilon$ has positive imaginary part by construction, this contour does encircle the pole of the integrand $\frac{e^{i \omega x}}{\omega - i \epsilon}$ at $\omega = i \epsilon$. Hence by the Cauchy integral formula in the case $x \gt 0$ one gets

$$ \begin{aligned} \underset{\epsilon \to 0^+}{\lim} \frac{1}{2 \pi i} \int_{-\infty}^\infty \frac{e^{i \omega x}}{\omega - i \epsilon} d\omega & = \underset{\epsilon \to 0^+}{\lim} \frac{1}{2 \pi i} \oint_{C_+} \frac{e^{i \omega x}}{\omega - i \epsilon} d \omega \\ & = \underset{\epsilon \to 0^+}{\lim} \left(e^{i \omega x}\vert_{\omega = i \epsilon}\right) \\ & = \underset{\epsilon \to 0^+}{\lim} e^{- \epsilon x} \\ & = e^0 = 1 \end{aligned} \,. $$

Conversely, for $x \lt 0$ the absolute value of the integrand decays as the imaginary part of $\omega$ becomes more negative, and hence in this case the given line integral equals the contour integral for a contour $C_- \subset \mathbb{C}$ closing in the lower half plane. Since the integrand has no pole in the lower half plane, in this case Cauchy's integral theorem says that this integral is zero.
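As a rough numerical cross-check of this contour argument, one may evaluate the regularized integral at finite $\epsilon$ and compare it with the residue value $e^{-\epsilon x}$ for $x \gt 0$, and with $0$ for $x \lt 0$. This is only a sketch under stated assumptions: the function names, the symmetric cutoff, the grid size and the value of `eps` are illustrative choices, and the symmetric truncation is what makes the slowly decaying tails cancel.

```python
import cmath
import math

def regularized_step(x, eps=0.1, cutoff=2000.0, n=400_000):
    # Midpoint rule for (1/(2 pi i)) * integral of exp(i w x)/(w - i eps) over [-cutoff, cutoff].
    h = 2.0 * cutoff / n
    total = 0j
    for i in range(n):
        w = -cutoff + (i + 0.5) * h
        total += cmath.exp(1j * w * x) / (w - 1j * eps)
    return total * h / (2j * math.pi)

def residue_value(x, eps=0.1):
    # Value predicted by closing the contour: the pole at w = i*eps contributes only for x > 0.
    return math.exp(-eps * x) if x > 0 else 0.0

for x in (1.0, -1.0):
    print(x, regularized_step(x), residue_value(x))
```

The agreement is limited by the cutoff (the truncation error decays like the inverse of `cutoff * abs(x)`), so this illustrates the statement rather than proving it.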

Conversely, by the Fourier inversion theorem, the Fourier transform of the Heaviside distribution is the Cauchy principal value as in prop. 61:

Example

(relation to Fourier transform of Heaviside distribution / Schwinger parameterization)

The Fourier transform of distributions (def. 96) of the Heaviside distribution is the following Cauchy principal value:

$$ \begin{aligned} \widehat \Theta(x) & = \int_0^\infty e^{i k x} \, dk \\ & = i \frac{1}{x + i 0^+} \end{aligned} $$

Here the second equality is also known as complex Schwinger parameterization.

Proof

As generalized functions consider the limit with a decaying component:

$$ \begin{aligned} \int_0^\infty e^{i k x} \, dk & = \underset{\epsilon \to 0^+}{\lim} \int_0^\infty e^{i k x - \epsilon k} \, dk \\ & = - \underset{\epsilon \to 0^+}{\lim} \frac{1}{ i x - \epsilon} \\ & = i \frac{1}{x + i 0^+} \end{aligned} $$

Let now $q \colon \mathbb{R}^{n} \to \mathbb{R}$ be a non-degenerate real quadratic form, analytically continued to a complex quadratic form

$$ q \;\colon\; \mathbb{C}^n \longrightarrow \mathbb{C} \,. $$

Write $\Delta$ for the determinant of $q$.

Write $q^\ast$ for the induced quadratic form on the dual vector space. Notice that $q$ (and hence $q^\ast$) is assumed non-degenerate but need not be positive or negative definite.

Proposition

(Fourier transform of principal value of power of quadratic form)

Let $m \in \mathbb{R}$ be any real number, and $\kappa \in \mathbb{C}$ any complex number. Then the Fourier transform of distributions of $1/(q + m^2 + i 0^+)^\kappa$ is

$$ \widehat { \left( \frac{1}{(q + m^2 + i0^+)^\kappa} \right) } \;=\; \frac{ 2^{1- \kappa} (\sqrt{2\pi})^{n} m^{n/2-\kappa} } { \Gamma(\kappa) \sqrt{\Delta} } \frac{ K_{n/2 - \kappa}\left( m \sqrt{q^\ast - i 0^+} \right) } { \left(\sqrt{q^\ast - i0^+ }\right)^{n/2 - \kappa} } \,, $$

where

  1. $\Gamma$ denotes the Gamma function

  2. $K_{\nu}$ denotes the modified Bessel function.

Notice that $K_\nu(a)$ diverges for $a \to 0$ as $a^{-\nu}$ (DLMF 10.30.2).

(Gel’fand-Shilov 66, III 2.8 (8) and (9), p 289)

Proposition

(Fourier transform of delta distribution applied to mass shell)

Let $m \in \mathbb{R}$, then the Fourier transform of distributions of the delta distribution $\delta$ applied to the “mass shell” $q + m^2$ is

$$ \widehat{ \delta(q + m^2) } \;=\; - \frac{i}{\sqrt{{\vert\Delta\vert}}} \left( e^{i \pi t /2 } \frac{ K_{n/2-1} \left( m \sqrt{ q^\ast + i0^+ } \right) }{ \left(\sqrt{q^\ast + i0^+}\right)^{n/2 - 1} } \;-\; e^{-i \pi t /2 } \frac{ K_{n/2-1} \left( m \sqrt{ q^\ast - i0^+ } \right) }{ \left(\sqrt{q^\ast - i0^+}\right)^{n/2 - 1} } \right) \,, $$

where $K_\nu$ denotes the modified Bessel function of order $\nu$.

Notice that $K_\nu(a)$ diverges for $a \to 0$ as $a^{-\nu}$ (DLMF 10.30.2).

(Gel’fand-Shilov 66, III 2.11 (7), p 294)

\,

propagators for the free scalar field on Minkowski spacetime

  1. Advanced and retarded propagators

  2. Causal propagator

  3. Hadamard propagator

  4. Feynman propagator

  5. Singular support and Wave front sets

\,

On Minkowski spacetime $\mathbb{R}^{p,1}$ consider the Klein-Gordon equation (example 25)

$$ \eta^{\mu \nu} \frac{\partial}{\partial x^\mu} \frac{\partial}{\partial x^\nu} \Phi - \left( \tfrac{m c}{\hbar} \right)^2 \Phi \;=\; 0 \,. $$

By example 83 the Fourier transform of the Klein-Gordon operator is

$$ - k_\mu k^\mu - \left( \tfrac{m c}{\hbar} \right)^2 \;=\; (k_0)^2 - {\vert \vec k\vert}^2 - \left( \tfrac{m c}{\hbar} \right)^2 \,. $$

We write the dispersion relation of this equation (see def. 91) as

(118) $$ \omega(\vec k) \;\coloneqq\; + c \sqrt{ {\vert \vec k \vert}^2 + \left( \tfrac{m c}{\hbar}\right)^2 } \,, $$

where on the right we choose the non-negative square root.
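To make the role of the dispersion relation concrete, here is a small sketch (with hypothetical function names and default units $c = \hbar = m = 1$, which are assumptions for illustration only) checking that the Fourier transformed Klein-Gordon operator vanishes exactly at $k_0 = \pm \omega(\vec k)/c$.

```python
import math

def omega(k_vec, m=1.0, c=1.0, hbar=1.0):
    # Dispersion relation (118): the non-negative square root.
    k_sq = sum(ki * ki for ki in k_vec)
    return c * math.sqrt(k_sq + (m * c / hbar) ** 2)

def klein_gordon_symbol(k0, k_vec, m=1.0, c=1.0, hbar=1.0):
    # Fourier transformed Klein-Gordon operator: (k_0)^2 - |k|^2 - (m c / hbar)^2.
    k_sq = sum(ki * ki for ki in k_vec)
    return k0 ** 2 - k_sq - (m * c / hbar) ** 2

k = (3.0, 4.0, 0.0)                                   # sample spatial wave vector
print(omega(k))                                       # sqrt(26) in these units
print(klein_gordon_symbol(+omega(k), k))              # vanishes up to rounding
print(klein_gordon_symbol(-omega(k), k))              # vanishes up to rounding
```

These two zeros in $k_0$ are the poles that the $\pm i \epsilon$ shifts move off the real axis in the propagators discussed next.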

\,

advanced and retarded propagators for Klein-Gordon equation on Minkowski spacetime

Proposition

(mode expansion of advanced and retarded propagators for Klein-Gordon operator on Minkowski spacetime)

The advanced and retarded Green functions $G_\pm$ (def. 78) of the Klein-Gordon operator on Minkowski spacetime (example 25) are induced from integral kernels (“propagators”), hence distributions in two variables

$$ \Delta_\pm \in \mathcal{D}'(\mathbb{R}^{p,1}\times \mathbb{R}^{p,1}) $$

by (in generalized function-notation, prop. 38)

$$ G_\pm(\Phi) \;=\; \underset{\mathbb{R}^{p,1}}{\int} \Delta_{\pm}(x,y) \Phi(y) \, dvol(y) $$

where the advanced and retarded propagators $\Delta_{\pm}(x,y)$ have the following equivalent expressions:

(119) $$ \begin{aligned} \Delta_\pm(x-y) & = \frac{1}{(2\pi)^{p+1}} \underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \int \int \frac{ e^{i k_0 (x^0 - y^0)} e^{i \vec k \cdot (\vec x - \vec y)} }{ (k_0 \mp i\epsilon)^2 - {\vert \vec k\vert}^2 -\left( \tfrac{m c}{\hbar}\right)^2 } \, d k_0 \, d^p \vec k \\ & = \left\{ \array{ \frac{\pm i}{(2\pi)^{p}} \int \frac{1}{2\omega(\vec k)/c} \left( e^{+i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x -\vec y)} - e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y) } \right) d^p \vec k & \vert & \text{if} \, \pm (x^0 - y^0) \gt 0 \\ 0 & \vert & \text{otherwise} } \right. \\ & = \left\{ \array{ \frac{\mp 1}{(2\pi)^{p}} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } d^p \vec k & \vert & \text{if} \, \pm (x^0 - y^0) \gt 0 \\ 0 & \vert & \text{otherwise} } \right. \end{aligned} $$

Here $\omega(\vec k)$ denotes the dispersion relation (118) of the Klein-Gordon equation.

Proof

The Klein-Gordon operator is a Green hyperbolic differential operator (example 63), therefore its advanced and retarded Green functions exist uniquely (prop. 41). Moreover, prop. 42 says that they are continuous linear functionals with respect to the topological vector space structures on spaces of smooth sections (def. 73). In the case of the Klein-Gordon operator this just means that

$$ G_{\pm} \;\colon\; C^\infty_{cp}(\mathbb{R}^{p,1}) \longrightarrow C^\infty_{\pm cp}(\mathbb{R}^{p,1}) $$

are continuous linear functionals in the standard sense of distributions. Therefore the Schwartz kernel theorem implies the existence of integral kernels being distributions in two variables

$$ \Delta_{\pm} \in \mathcal{D}'(\mathbb{R}^{p,1} \times \mathbb{R}^{p,1}) $$

such that, in the notation of generalized functions,

$$ (G_\pm \alpha)(x) \;=\; \underset{\mathbb{R}^{p,1}}{\int} \Delta_{\pm}(x,y) \alpha(y) \, dvol(y) \,. $$

These integral kernels are the advanced/retarded “propagators”. We now compute these integral kernels by making an Ansatz and showing that it has the defining properties, which identifies them by the uniqueness statement of prop. 41.

We make use of the fact that the Klein-Gordon equation is invariant under the defining action of the Poincaré group on Minkowski spacetime, which is a semidirect product group of the translation group and the Lorentz group.

Since the Klein-Gordon operator is invariant, in particular, under translations in $\mathbb{R}^{p,1}$, it is clear that the propagators, as distributions in two variables, depend only on the difference of their two arguments

(120) $$ \Delta_{\pm}(x,y) = \Delta_{\pm}(x-y) \,. $$

Since moreover the Klein-Gordon operator is formally self-adjoint, this implies that for $P$ the Klein-Gordon operator the equation (82)

$$ P \circ G_\pm = id $$

is equivalent to the equation (81)

$$ G_\pm \circ P = id \,. $$

Therefore it is sufficient to solve the first of these two equations, subject to the defining support conditions. In terms of the propagator integral kernels this means that we have to solve the distributional equation

(121) $$ \left( \eta^{\mu \nu} \frac{\partial}{\partial x^\mu} \frac{\partial}{\partial x^\nu} - \left( \tfrac{m c}{\hbar} \right)^2 \right) \Delta_\pm(x-y) \;=\; \delta(x-y) $$

subject to the condition that the distributional support (def. 74) is contained in the closed future or past cone, respectively:

$$ supp\left( \Delta_{\pm}(x-y) \right) \subset \left\{ {\vert x-y\vert^2_\eta} \leq 0 \;\,,\; \pm(x^0 - y^0) \geq 0 \right\} \,. $$

We make the Ansatz that $\Delta_{\pm}$, as a distribution in the single variable $x-y$, is a tempered distribution

$$ \Delta_\pm \in \mathcal{S}'(\mathbb{R}^{p,1}) \,, $$

hence amenable to Fourier transform of distributions (def. 96). If we do find a solution this way, it is guaranteed to be the unique solution by prop. 41.

By example 82 the distributional Fourier transform of equation (121) is

(122) $$ \begin{aligned} \left( - \eta^{\mu \nu} k_\mu k_\nu - \left( \tfrac{m c}{\hbar} \right)^2 \right) \widehat{\Delta_{\pm}}(k) & = \widehat{\delta}(k) \\ & = 1 \end{aligned} \,, $$

where in the second line we used the Fourier transform of the delta distribution from example 97.

Notice that this implies that the Fourier transform of the causal propagator

$$ \Delta_S \coloneqq \Delta_+ - \Delta_- $$

satisfies the homogeneous equation:

(123) $$ \left( - \eta^{\mu \nu} k_\mu k_\nu - \left( \tfrac{m c}{\hbar} \right)^2 \right) \widehat{\Delta_S}(k) \;=\; 0 \,. $$

Hence we are now reduced to finding solutions $\widehat{\Delta_\pm} \in \mathcal{S}'(\mathbb{R}^{p,1})$ to (122) such that their Fourier inverse $\Delta_\pm$ has the required support properties.

We discuss this by a variant of the Cauchy principal value:

Suppose the following limit of non-singular distributions in the variable $k \in \mathbb{R}^{p,1}$ exists in the space of distributions

(124) $$ \underset{ {\epsilon \in (0,\infty)} \atop { \epsilon \to 0 } }{\lim} \frac{1}{ (k_0 \mp i \epsilon)^2 - {\vert \vec k\vert^2} - \left( \tfrac{m c}{\hbar} \right)^2 } \;\in\; \mathcal{D}'(\mathbb{R}^{p,1}) $$

meaning that for each bump function $b \in C^\infty_{cp}(\mathbb{R}^{p,1})$ the limit in $\mathbb{C}$

$$ \underset{ {\epsilon \in (0,\infty)} \atop { \epsilon \to 0 } }{\lim} \underset{\mathbb{R}^{p,1}}{\int} \frac{b(k)}{ (k_0\mp i \epsilon)^2 - {\vert \vec k\vert}^2 - \left( \tfrac{m c}{\hbar} \right)^2 } d^{p+1}k \;\in\; \mathbb{C} $$

exists. Then this limit is a solution to the distributional equation (122), because on those bump functions $b(k)$ which happen to be products with $\left(-\eta^{\mu \nu}k_\mu k_\nu - \left( \tfrac{m c}{\hbar}\right)^2\right)$ we have

$$ \begin{aligned} \underset{ {\epsilon \in (0,\infty)} \atop { \epsilon \to 0 } }{\lim} \underset{\mathbb{R}^{p,1}}{\int} \frac{ \left( -\eta^{\mu \nu} k_\mu k_\nu - \left( \tfrac{m c}{\hbar} \right)^2 \right) b(k) }{ (k_0\mp i \epsilon)^2 - {\vert \vec k\vert}^2 - \left( \tfrac{m c}{\hbar} \right)^2 } d^{p+1}k & = \underset{\mathbb{R}^{p,1}}{\int} \underset{= 1}{ \underbrace{ \underset{ {\epsilon \in (0,\infty)} \atop { \epsilon \to 0 } }{\lim} \frac{ \left( -\eta^{\mu \nu} k_\mu k_\nu - \left( \tfrac{m c}{\hbar} \right)^2 \right) }{ (k_0\mp i \epsilon)^2 - {\vert \vec k\vert}^2 - \left( \tfrac{m c}{\hbar} \right)^2 } } } b(k)\, d^{p+1}k \\ & = \langle 1, b\rangle \,. \end{aligned} $$

Moreover, if the limiting distribution (124) exists, then it is clearly a tempered distribution, hence we may apply Fourier inversion to obtain Green functions

(125) $$ \Delta_{\pm}(x,y) \;\coloneqq\; \underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \frac{1}{(2\pi)^{p+1}} \underset{\mathbb{R}^{p,1}}{\int} \frac{e^{i k_\mu (x-y)^\mu}}{ (k_0 \mp i \epsilon )^2 - {\vert \vec k\vert}^2 - \left(\tfrac{m c}{\hbar}\right)^2 } d k_0 d^p \vec k \,. $$

To see that this is the correct answer, we need to check the defining support property.

Finally, by the Fourier inversion theorem, to show that the limit (124) indeed exists it is sufficient to show that the limit in (125) exists.

We compute as follows

(126) $$ \begin{aligned} \Delta_\pm(x-y) & = \frac{1}{(2\pi)^{p+1}} \underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \int \int \frac{ e^{i k_0 (x^0 - y^0)} e^{i \vec k \cdot (\vec x - \vec y)} }{ (k_0 \mp i\epsilon)^2 - {\vert \vec k\vert}^2 -\left( \tfrac{m c}{\hbar}\right)^2 } \, d k_0 \, d^p \vec k \\ & = \frac{1}{(2\pi)^{p+1}} \underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \int \int \frac{ e^{i k_0 (x^0 - y^0)} e^{i \vec k \cdot (\vec x - \vec y)} }{ (k_0 \mp i \epsilon)^2 - \left(\omega(\vec k)/c\right)^2 } \, d k_0 \, d^p \vec k \\ &= \frac{1}{(2\pi)^{p+1}} \underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \int \int \frac{ e^{i k_0 (x^0 - y^0)} e^{i \vec k \cdot (\vec x - \vec y)} }{ \left( (k_0 \mp i\epsilon) - \omega(\vec k)/c \right) \left( (k_0 \mp i \epsilon) + \omega(\vec k)/c \right) } \, d k_0 \, d^p \vec k \\ & = \left\{ \array{ \frac{\pm i}{(2\pi)^{p}} \int \frac{1}{2\omega(\vec k)/c} \left( e^{i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x -\vec y)} - e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y)} \right) d^p \vec k & \vert & \text{if} \, \pm (x^0 - y^0) \gt 0 \\ 0 & \vert & \text{otherwise} } \right. \\ & = \left\{ \array{ \frac{\mp 1}{(2\pi)^{p}} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } d^p \vec k & \vert & \text{if} \, \pm (x^0 - y^0) \gt 0 \\ 0 & \vert & \text{otherwise} } \right. \end{aligned} $$

where $\omega(\vec k)$ denotes the dispersion relation (118) of the Klein-Gordon equation. The last step is simply the application of Euler's formula $\sin(\alpha) = \tfrac{1}{2 i }\left( e^{i \alpha} - e^{- i \alpha}\right)$.

Here the key step is the application of Cauchy's integral formula in the fourth step. We spell this out now for $\Delta_+$; the discussion for $\Delta_-$ is the same, just with the appropriate signs reversed.

  1. If $(x^0 - y^0) \gt 0$ then the expression $e^{ik_0 (x^0 - y^0)}$ decays with positive imaginary part of $k_0$, so that we may expand the integration domain into the upper half plane as
$$ \begin{aligned} \int_{-\infty}^\infty d k_0 & = \phantom{+} \int_{-\infty}^0 d k_0 + \int_{0}^{+ i \infty} d k_0 \\ & \phantom{=} + \int_{+i \infty}^0 d k_0 + \int_0^\infty d k_0 \,; \end{aligned} $$

Conversely, if $(x^0 - y^0) \lt 0$ then we may analogously expand into the lower half plane.

  2. This integration domain may then further be completed to two contour integrations. For the expansion into the upper half plane these encircle counter-clockwise the poles at $\pm \omega(\vec k)/c + i\epsilon \in \mathbb{C}$, while for expansion into the lower half plane no poles are being encircled.

  3. Apply Cauchy's integral formula to find in the case $(x^0 - y^0)\gt 0$ the sum of the residues at these two poles times $2\pi i$, zero in the other case. (For the retarded propagator we get $- 2 \pi i$ times the residues, because now the contours encircling non-trivial poles go clockwise).

  4. The result is now non-singular at $\epsilon = 0$ and therefore the limit $\epsilon \to 0$ is now computed by evaluating at $\epsilon = 0$.

This computation shows a) that the limiting distribution indeed exists, and b) that the support of $\Delta_+$ is in the future, and that of $\Delta_-$ is in the past.

Hence it only remains to see now that the support of $\Delta_\pm$ is inside the causal cone. But this follows from the previous argument, by using that the Klein-Gordon equation is invariant under Lorentz transformations: This implies that the support is in fact in the future of every spacelike slice through the origin in $\mathbb{R}^{p,1}$, hence in the closed future cone of the origin.
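The residue computation can be illustrated numerically in the simplest conceivable case. The following sketch assumes $p = 0$ (no spatial directions, so that $\omega(\vec k)/c$ reduces to a single constant, here called `omega`) and units $c = \hbar = 1$; these simplifications and all function names are assumptions made for illustration. At finite $\epsilon$ the poles sit at $\pm\omega + i\epsilon$, so the contour argument predicts $-e^{-\epsilon t} \sin(\omega t)/\omega$ for $t = x^0 - y^0 \gt 0$ and $0$ for $t \lt 0$.

```python
import cmath
import math

def delta_plus_regularized(t, omega=1.0, eps=0.1, cutoff=2000.0, n=400_000):
    # Midpoint rule for (1/(2 pi)) * integral of exp(i k0 t) / ((k0 - i eps)^2 - omega^2)
    # over k0 in [-cutoff, cutoff]; the integrand decays like 1/k0^2.
    h = 2.0 * cutoff / n
    total = 0j
    for i in range(n):
        k0 = -cutoff + (i + 0.5) * h
        total += cmath.exp(1j * k0 * t) / ((k0 - 1j * eps) ** 2 - omega ** 2)
    return total * h / (2.0 * math.pi)

def residue_result(t, omega=1.0, eps=0.1):
    # Sum of residues at k0 = +omega + i eps and k0 = -omega + i eps, times 2 pi i / (2 pi),
    # picked up only when the contour closes in the upper half plane (t > 0).
    if t <= 0:
        return 0.0
    return -math.exp(-eps * t) * math.sin(omega * t) / omega

for t in (1.5, -1.5):
    print(t, delta_plus_regularized(t), residue_result(t))
```

Sending `eps` to zero in `residue_result` recovers the $\mp \sin$ expression of (126) restricted to this toy case, with support only at $t \gt 0$.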

Corollary

(causal propagator is skew-symmetric)

Under reversal of arguments the advanced and retarded propagators from prop. 64 are related by

$$ \Delta_{\pm}(y-x) = \Delta_\mp(x-y) \,. $$

It follows that the causal propagator $\Delta_S \coloneqq \Delta_+ - \Delta_-$ is skew-symmetric in its arguments:

$$ \Delta_S(x-y) = - \Delta_S(y-x) \,. $$
Proof

By prop. 64 we have with (119)

$$ \begin{aligned} \Delta_\pm(y-x) & = \left\{ \array{ \frac{\pm i}{(2\pi)^{p}} \int \frac{1}{2\omega(\vec k)/c} \left( e^{-i \omega(\vec k)(x^0 - y^0)/c - i \vec k \cdot (\vec x -\vec y)} - e^{+i \omega(\vec k)(x^0 - y^0)/c - i \vec k \cdot (\vec x - \vec y) } \right) d^p \vec k & \vert & \text{if} \, \mp (x^0 - y^0) \gt 0 \\ 0 & \vert & \text{otherwise} } \right. \\ & = \left\{ \array{ \frac{\pm i}{(2\pi)^{p}} \int \frac{1}{2\omega(\vec k)/c} \left( e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x -\vec y)} - e^{+i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y) } \right) d^p \vec k & \vert & \text{if} \, \mp (x^0 - y^0) \gt 0 \\ 0 & \vert & \text{otherwise} } \right. \\ & = \left\{ \array{ \frac{\mp i}{(2\pi)^{p}} \int \frac{1}{2\omega(\vec k)/c} \left( e^{+i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x -\vec y)} - e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y) } \right) d^p \vec k & \vert & \text{if} \, \mp (x^0 - y^0) \gt 0 \\ 0 & \vert & \text{otherwise} } \right. \\ & = \Delta_\mp(x-y) \end{aligned} $$

Here in the second step we applied the change of integration variables $\vec k \mapsto - \vec k$ (which introduces no sign because in addition to $d \vec k \mapsto - d \vec k$ the integration domain reverses orientation), and in the third step we exchanged the two summands.

\,

causal propagator

Proposition

(mode expansion of causal propagator for Klein-Gordon equation on Minkowski spacetime)

The causal propagator (84) for the Klein-Gordon equation for mass $m$ on Minkowski spacetime $\mathbb{R}^{p,1}$ (example 25) is given, in generalized function notation, by

(127) $$ \begin{aligned} \Delta_S(x,y) & = \frac{+ i}{(2\pi)^{p}} \int \frac{1}{2\omega(\vec k)/c} \left( e^{i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x -\vec y)} - e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y)} \right) d^p \vec k \\ & = \frac{-1}{(2\pi)^p} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x -\vec y)} d^p \vec k \,, \end{aligned} $$

where in the second line we used Euler's formula $\sin(\alpha)= \tfrac{1}{2i}\left( e^{i \alpha} - e^{-i \alpha} \right)$.

In particular this shows that the causal propagator is real, in that it is equal to its complex conjugate

(128) $$ \left(\Delta_S(x,y)\right)^\ast = \Delta_S(x,y) \,. $$
Proof

By definition and using the expression from prop. 64 for the advanced and retarded propagators we have

$$ \begin{aligned} \Delta_S(x,y) & \coloneqq \Delta_+(x,y) - \Delta_-(x,y) \\ & = \left\{ \array{ \frac{+ i}{(2\pi)^{p}} \int \frac{1}{2\omega(\vec k)/c} \left( e^{i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x -\vec y)} - e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y)} \right) d^p \vec k & \vert & \text{if} \, + (x^0 - y^0) \gt 0 \\ \frac{(-1) (-1) i}{(2\pi)^{p}} \int \frac{1}{2\omega(\vec k)/c} \left( e^{i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x -\vec y)} - e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y)} \right) d^p \vec k & \vert & \text{if} \, - (x^0 - y^0) \gt 0 } \right. \\ & = \frac{+ i}{(2\pi)^{p}} \int \frac{1}{2\omega(\vec k)/c} \left( e^{i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x -\vec y)} - e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y)} \right) d^p \vec k \\ & = \frac{-1}{(2\pi)^p} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x -\vec y)} d^p \vec k \end{aligned} $$

For the reality, notice from the last line that

$$ \begin{aligned} \left(\Delta_S(x,y)\right)^\ast & = \frac{-1}{(2\pi)^p} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{-i \vec k \cdot (\vec x -\vec y)} d^p \vec k \\ & = \frac{-1}{(2\pi)^p} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{+i \vec k \cdot (\vec x -\vec y)} d^p \vec k \\ & = \Delta_S(x,y) \,, \end{aligned} $$

where in the second step we used the change of integration variables $\vec k \mapsto - \vec k$ (which introduces no sign, since on top of $d \vec k \mapsto - d \vec k$ the orientation of the integration domain changes).

We consider a couple of equivalent expressions for the causal propagator which are useful for computations:

Proposition

(causal propagator for Klein-Gordon operator on Minkowski spacetime as a contour integral)

The causal propagator (prop. 42) for the Klein-Gordon equation at mass $m$ on Minkowski spacetime (example 25) has the following equivalent expression, as a generalized function, given as a contour integral along a Jordan curve $C(\vec k)$ going counter-clockwise around the two poles at $k_0 = \pm \omega(\vec k)/c$:

$$ \Delta_S(x,y) \;=\; (2\pi)^{-(p+1)} \int \underset{C(\vec k)}{\oint} \frac{e^{i k_\mu (x-y)^\mu}}{ -k_\mu k^\mu - \left( \tfrac{m c}{\hbar} \right)^2 } \,d k_0 \,d^{p} \vec k \,. $$

graphics grabbed from Kocic 16

Proof

By Cauchy's integral formula we compute as follows:

$$ \begin{aligned} (2\pi)^{-(p+1)} \int \underset{C(\vec k)}{\oint} \frac{e^{i k_\mu (x^\mu - y^\mu)}}{ -k_\mu k^\mu - \left( \tfrac{m c}{\hbar} \right)^2 } \,d k_0 \,d^{p} \vec k & = (2\pi)^{-(p+1)} \int \underset{C(\vec k)}{\oint} \frac{ e^{i k_0 (x^0 - y^0)} e^{ i \vec k \cdot (\vec x - \vec y)} }{ k_0^2 - \omega(\vec k)^2/c^2 } \,d k_0 \,d^p \vec k \\ & = (2\pi)^{-(p+1)} \int \underset{C(\vec k)}{\oint} \frac{ e^{i k_0 (x^0 - y^0)} e^{i \vec k \cdot (\vec x - \vec y)} }{ ( k_0 + \omega(\vec k)/c ) ( k_0 - \omega(\vec k)/c ) } \,d k_0 \,d^p \vec k \\ & = (2\pi)^{-(p+1)} 2\pi i \int \left( \frac{ e^{i \omega(\vec k) (x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} } { 2 \omega(\vec k)/c } - \frac{ e^{ - i \omega(\vec k) (x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} }{ 2 \omega(\vec k)/c } \right) \,d^p \vec k \\ & = - (2\pi)^{-p} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y)} \,d^p \vec k \,. \end{aligned} $$

The last line is the expression for the causal propagator from prop. 65.
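A quick way to see this contour formula at work is to evaluate the $k_0$-integral numerically on an explicit circle. The sketch below again assumes the toy case $p = 0$ with $c = \hbar = 1$, a circular contour of radius larger than `omega`, and an illustrative function name; it compares the result with $-\sin(\omega t)/\omega$ and also checks skew-symmetry under $t \mapsto -t$.

```python
import cmath
import math

def causal_propagator_contour(t, omega=1.0, radius=2.0, n=2000):
    # (1/(2 pi)) * counter-clockwise contour integral of exp(i k0 t)/(k0^2 - omega^2)
    # over the circle |k0| = radius, which encloses both poles k0 = +omega and k0 = -omega.
    # The trapezoidal rule in the angle converges very fast for this periodic analytic integrand.
    dtheta = 2.0 * math.pi / n
    total = 0j
    for j in range(n):
        k0 = radius * cmath.exp(1j * j * dtheta)
        dk0 = 1j * k0 * dtheta
        total += cmath.exp(1j * k0 * t) / (k0 * k0 - omega * omega) * dk0
    return total / (2.0 * math.pi)

t = 0.7
print(causal_propagator_contour(t), -math.sin(t))
print(causal_propagator_contour(t) + causal_propagator_contour(-t))  # skew-symmetry
```

Unlike the retarded and advanced cases, no $i\epsilon$-shift is needed here, since the contour stays away from both poles.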

Proposition

(causal propagator as Fourier transform of delta distribution on the Fourier transformed Klein-Gordon operator)

The causal propagator for the Klein-Gordon equation at mass $m$ on Minkowski spacetime has the following equivalent expression, as a generalized function:

$$ \Delta_S(x,y) \;=\; i (2\pi)^{-p} \int \delta\left( k_\mu k^\mu + \left( \tfrac{m c}{\hbar}\right)^2 \right) sgn( k_0 ) e^{ i k_\mu (x-y)^\mu } d^{p+1} k \,, $$

where the integrand is the product of the sign function of $k_0$ with the delta distribution of the Fourier transform of the Klein-Gordon operator and a plane wave factor.

Proof

By decomposing the integral over $k_0$ into its negative and its positive half, and applying the change of integration variables $k_0 = \pm\sqrt{h}$, we get

\begin{aligned}
i (2\pi)^{-p} \int \delta\left( k_\mu k^\mu + \left( \tfrac{m c}{\hbar}\right)^2 \right) sgn( k_0 ) e^{ i k_\mu (x-y)^\mu } d^{p+1} k
& = + i (2\pi)^{-p} \int \int_0^\infty \delta\left( -k_0^2 + \vec k^2 + \left( \tfrac{m c}{\hbar}\right)^2 \right) e^{ i k_0 (x^0 - y^0) + i \vec k \cdot (\vec x - \vec y)} d k_0 \, d^p \vec k \\
& \phantom{=} - i (2\pi)^{-p} \int \int_{-\infty}^0 \delta\left( -k_0^2 + \vec k^2 + \left(\tfrac{m c}{\hbar}\right)^2 \right) e^{ i k_0 (x^0 - y^0)+ i \vec k \cdot (\vec x - \vec y) } d k_0 \, d^{p} \vec k \\
& = +i (2\pi)^{-p} \int \int_0^\infty \frac{1}{2 \sqrt{h}} \delta\left( -h + \omega(\vec k)^2/c^2 \right) e^{ + i \sqrt{h} (x^0 - y^0) + i \vec k \cdot (\vec x - \vec y) } d h \, d^{p} \vec k \\
& \phantom{=} - i (2\pi)^{-p} \int \int_0^\infty \frac{1}{2 \sqrt{h}} \delta\left( - h + \omega(\vec k)^2/c^2 \right) e^{ - i \sqrt{h} (x^0 - y^0) + i \vec k \cdot (\vec x - \vec y) } d h \, d^{p} \vec k \\
& = +i (2\pi)^{-p} \int \frac{1}{2 \omega(\vec k)/c} e^{ i \omega(\vec k) (x-y)^0/c + i \vec k \cdot (\vec x - \vec y)} d^{p} \vec k \\
& \phantom{=} - i (2\pi)^{-p} \int \frac{1}{2 \omega(\vec k)/c} e^{ - i \omega(\vec k) (x-y)^0/c + i \vec k \cdot (\vec x - \vec y) } d^{p} \vec k \\
& = -(2 \pi)^{-p} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x-y)^0/c \right) e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k \,.
\end{aligned}

The last line is the expression for the causal propagator from prop. 65.

\,

Hadamard propagator

Prop. 67 exhibits the causal propagator of the Klein-Gordon operator on Minkowski spacetime as the difference of a contribution for positive temporal angular frequency $k_0 \propto \omega(\vec k)$ (hence positive energy $\hbar \omega(\vec k)$) and a contribution of negative temporal angular frequency.

The positive frequency contribution to the causal propagator is called the Hadamard propagator (def. 107 below), also known as the vacuum state 2-point function of the free real scalar field on Minkowski spacetime. Notice that the temporal component of the wave vector is proportional to the negative angular frequency

k_0 = -\omega/c

(see at plane wave); this is why the step function $\Theta(-k_0)$ appears in (129) below:

Definition

(Hadamard propagator or vacuum state 2-point function for Klein-Gordon operator on Minkowski spacetime)

The Hadamard propagator for the Klein-Gordon operator at mass $m$ on Minkowski spacetime (example 25) is the tempered distribution in two variables $\Delta_H \in \mathcal{S}'(\mathbb{R}^{p,1})$ which as a generalized function is given by the expression

(129) \begin{aligned} \Delta_H(x,y) & \coloneqq \frac{1}{(2\pi)^p} \int \delta\left( k_\mu k^\mu + \left( \tfrac{m c}{\hbar} \right)^2 \right) \Theta( -k_0 ) e^{i k_\mu (x^\mu-y^\mu) } \, d^{p+1} k \\ & = \frac{1}{(2\pi)^p} \int \frac{1}{2 \omega(\vec k)/c} e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k \,. \end{aligned}

Here in the first line we have in the integrand the delta distribution of the Fourier transform of the Klein-Gordon operator times a plane wave and times the step function $\Theta$ of the temporal component of the wave vector. In the second line we used the change of integration variables $k_0 = -\sqrt{h}$ (the step function restricts to negative $k_0$), then the definition of the delta distribution and the fact that $\omega(\vec k)$ is by definition the non-negative solution to the Klein-Gordon dispersion relation.

(e.g. Khavkine-Moretti 14, equation (38) and section 3.4)

Proposition

(contour integral representation of the Hadamard propagator for the Klein-Gordon operator on Minkowski spacetime)

The Hadamard propagator from def. 107 is equivalently given by the contour integral

(130) \Delta_H(x,y) \;=\; -i(2\pi)^{-(p+1)} \int \underset{C_+(\vec k)}{\oint} \frac{e^{-i k_\mu (x-y)^\mu}}{ -k_\mu k^\mu - \left( \tfrac{m c}{\hbar} \right)^2 } d k_0 d^{p} k \,,

where the Jordan curve $C_+(\vec k) \subset \mathbb{C}$ runs counter-clockwise, enclosing the point $+ \omega(\vec k)/c \in \mathbb{R} \subset \mathbb{C}$, but not enclosing the point $- \omega(\vec k)/c \in \mathbb{R} \subset \mathbb{C}$.

graphics grabbed from Kocic 16

Proof

We compute as follows:

\begin{aligned}
-i(2\pi)^{-(p+1)} \int \underset{C_+(\vec k)}{\oint} \frac{e^{ - i k_\mu (x-y)^\mu}}{ -k_\mu k^\mu - \left( \tfrac{m c}{\hbar} \right)^2 } d k_0 d^{p} k
& = -i(2\pi)^{-(p+1)} \int \underset{C_+(\vec k)}{\oint} \frac{ e^{ -i k_0 (x^0 - y^0)} e^{i \vec k \cdot (\vec x - \vec y)} }{ k_0^2 - \omega(\vec k)^2/c^2 } d k_0 d^p \vec k \\
& = -i(2\pi)^{-(p+1)} \int \underset{C_+(\vec k)}{\oint} \frac{ e^{ - i k_0 (x^0-y^0)} e^{i \vec k \cdot (\vec x - \vec y)} }{ ( k_0 - \omega(\vec k)/c ) ( k_0 + \omega(\vec k)/c ) } d k_0 d^p \vec k \\
& = (2\pi)^{-p} \int \frac{1}{2 \omega(\vec k)/c} e^{-i \omega(\vec k) (x^0-y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} d^p \vec k \,.
\end{aligned}

In the first step we used that the integral over $\vec k$ is invariant under $\vec k \mapsto - \vec k$. The last step is an application of Cauchy's integral formula, which says that the contour integral picks up $2\pi i$ times the residue of the integrand at the pole $+ \omega(\vec k)/c \in \mathbb{R} \subset \mathbb{C}$. The last line is $\Delta_H(x,y)$, by definition 107.
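As a numerical sanity check of the residue step, the following minimal sketch evaluates the $k_0$-contour integral for a single mode and compares it with $2\pi i$ times the residue. The sample values of `t` (standing for $x^0 - y^0$) and `w` (standing for $\omega(\vec k)/c$), the circular choice of contour, and all helper names are illustrative assumptions, not taken from the text.

```python
import cmath
import math

def contour_integral(f, center, radius, n=2000):
    """Counter-clockwise integral of f over a circle, by the trapezoid rule."""
    total = 0j
    for j in range(n):
        phi = 2 * math.pi * j / n
        z = center + radius * cmath.exp(1j * phi)
        dz = 1j * radius * cmath.exp(1j * phi) * (2 * math.pi / n)
        total += f(z) * dz
    return total

def k0_integrand(k0, t, w):
    # Integrand of the k_0-integral for one mode; poles at k0 = +w and k0 = -w.
    return cmath.exp(-1j * k0 * t) / ((k0 - w) * (k0 + w))

def hadamard_mode(t, w):
    # Per-mode factor appearing in the second line of (129).
    return cmath.exp(-1j * w * t) / (2 * w)

t, w = 0.7, 1.3  # hypothetical sample values
# Circle around +w with radius w/2: encloses +w but not -w, like C_+.
lhs = contour_integral(lambda k0: k0_integrand(k0, t, w), center=w, radius=0.5 * w)
rhs = 2j * math.pi * hadamard_mode(t, w)
print(abs(lhs - rhs))  # small: the contour integral is 2*pi*i times the residue
```

Multiplying `lhs` by the prefactor $-i/(2\pi)$ then reproduces the per-mode factor of (129), which is the content of the last step of the proof.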

Proposition

(skew-symmetric part of Hadamard propagator is the causal propagator)

The Hadamard propagator for the Klein-Gordon equation on Minkowski spacetime (def. 107) is of the form

(131) \begin{aligned} \Delta_H & = \tfrac{i}{2} \Delta_S + H \\ & = \tfrac{i}{2} \left( \Delta_+ - \Delta_- \right) + H \,, \end{aligned}

where

  1. $\Delta_S$ is the causal propagator (prop. 64), which is real (128) and skew-symmetric (prop. 2):

    (\Delta_S(x,y))^\ast = \Delta_S(x,y) \,, \qquad \Delta_S(y,x) = - \Delta_S(x,y)

  2. $H$ is real and symmetric:

    (H(x,y))^\ast = H(x,y) \,, \qquad H(y,x) = H(x,y)
Proof

By applying Euler's formula to (129) we obtain

(132) \begin{aligned} \Delta_H(x,y) & = \frac{1}{(2\pi)^p} \int \frac{1}{2 \omega(\vec k)/c} e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k \\ & = \tfrac{i}{2} \underset{= \Delta_S(x,y)}{ \underbrace{ \frac{-1}{(2\pi)^p} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k }} \;+\; \underset{ \coloneqq H(x,y) }{ \underbrace{ \frac{1}{(2\pi)^p} \int \frac{1}{2 \omega(\vec k)/c} \cos\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k }} \end{aligned}

The first summand is identified with the causal propagator by (127), prop. 65.

The second summand is unchanged both under complex conjugation and under $(x-y) \mapsto (y-x)$: each of these operations is undone by the change of integration variables $\vec k \mapsto - \vec k$ (using that the cosine is an even function), which does not change the integral. Hence $H$ is real and symmetric.
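The decomposition (131) and the stated reality and symmetry properties can be checked on a toy model. The sketch below is an illustration under loud assumptions: spatial dimension $p = 1$, units with $c = \hbar = 1$, a hypothetical mass `M`, and the $\vec k$-integral replaced by a finite sum over a symmetric grid. None of these choices come from the text.

```python
import cmath
import math

M = 1.0                                   # hypothetical mass, units with c = hbar = 1
DK = 0.05                                 # grid spacing of the toy k-integral
KS = [DK * (j - 100) for j in range(201)] # grid symmetric under k -> -k

def omega(k):
    # Non-negative solution of the Klein-Gordon dispersion relation.
    return math.sqrt(k * k + M * M)

def mode_sum(weight, t, x):
    # Discretized (2 pi)^{-1} * integral of weight(omega, t) * exp(i k x) dk.
    return sum(weight(omega(k), t) * cmath.exp(1j * k * x) for k in KS) * DK / (2 * math.pi)

def delta_H(t, x):
    return mode_sum(lambda w, s: cmath.exp(-1j * w * s) / (2 * w), t, x)

def delta_S(t, x):
    return mode_sum(lambda w, s: -math.sin(w * s) / w, t, x)

def H(t, x):
    return mode_sum(lambda w, s: math.cos(w * s) / (2 * w), t, x)

t, x = 0.37, -1.2  # hypothetical sample point, standing for x - y
print(abs(delta_H(t, x) - (0.5j * delta_S(t, x) + H(t, x))))  # decomposition (131)
print(abs(H(t, x).imag), abs(H(-t, -x) - H(t, x)))            # H real and symmetric
print(abs(delta_S(-t, -x) + delta_S(t, x)))                   # Delta_S skew-symmetric
```

Here exchanging $x$ and $y$ corresponds to `(t, x) -> (-t, -x)`, and the symmetric grid plays the role of the invariance of the integral under $\vec k \mapsto -\vec k$.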

\,

Feynman propagator

We have seen that the positive frequency component of the causal propagator $\Delta_S$ for the Klein-Gordon equation on Minkowski spacetime (prop. 64) is the Hadamard propagator $\Delta_H$ (def. 107) given, according to prop. 69, by (131):

\begin{aligned} \Delta_H & = \tfrac{i}{2} \Delta_S + H \\ & = \tfrac{i}{2} \left( \Delta_+ - \Delta_- \right) + H \,. \end{aligned}

There is an evident variant of this combination, which will be of interest:

Definition

(Feynman propagator for Klein-Gordon equation on Minkowski spacetime)

The Feynman propagator for the Klein-Gordon equation on Minkowski spacetime (example 25) is the linear combination

\Delta_F \coloneqq \tfrac{i}{2} \left( \Delta_+ + \Delta_- \right) + H

where the first term is proportional to the sum of the advanced and retarded propagators (prop. 64) and the second is the symmetric part of the Hadamard propagator according to prop. 69.

Similarly the anti-Feynman propagator is

\Delta_{\overline{F}} \coloneqq \tfrac{i}{2} \left( \Delta_+ + \Delta_- \right) - H \,.
Proposition

(mode expansion for Feynman propagator of Klein-Gordon equation on Minkowski spacetime)

The Feynman propagator (def. 108) for the Klein-Gordon equation on Minkowski spacetime is given by the following equivalent expressions

\begin{aligned}
\Delta_F(x,y)
& = \left\{ \array{
\frac{1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{-i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \gt 0
\\
\frac{1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{+i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \lt 0
} \right.
\\
& = \left\{ \array{ \Delta_H(x,y) &\vert& (x^0 - y^0) \gt 0 \\ \Delta_H(y,x) &\vert& (x^0 - y^0) \lt 0 } \right.
\end{aligned}

Similarly the anti-Feynman propagator is equivalently given by

\begin{aligned}
\Delta_{\overline{F}}(x,y)
& = \left\{ \array{
\frac{-1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{+i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \gt 0
\\
\frac{-1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{-i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \lt 0
} \right.
\\
& = \left\{ \array{ -\Delta_H(y,x) &\vert& (x^0 - y^0) \gt 0 \\ -\Delta_H(x,y) &\vert& (x^0 - y^0) \lt 0 } \right.
\end{aligned}
Proof

By the mode expansion of $\Delta_{\pm}$ from (119) and the mode expansion of $H$ from (132) we have

\begin{aligned}
\Delta_F(x,y)
& = \left\{ \array{
\underset{ = \tfrac{i}{2} \Delta_+(x,y) + 0 \;\text{for}\; (x^0 - y^0) \gt 0 }{ \underbrace{ \frac{- i}{(2\pi)^{p}} \int \frac{1}{2 \omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k } }
+ \underset{ = H(x,y) }{ \underbrace{ \frac{1}{(2\pi)^p} \int \frac{1}{2 \omega(\vec k)/c} \cos\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k } }
&\vert& (x^0 - y^0) \gt 0
\\
\underset{ = 0 + \tfrac{i}{2}\Delta_-(x,y) \;\text{for}\; (x^0 - y^0) \lt 0 }{ \underbrace{ \frac{+ i}{(2\pi)^{p}} \int \frac{1}{2 \omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k } }
+ \underset{ = H(x,y) }{ \underbrace{ \frac{1}{(2\pi)^p} \int \frac{1}{2 \omega(\vec k)/c} \cos\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k } }
&\vert& (x^0 - y^0) \lt 0
} \right.
\\
& = \left\{ \array{
\frac{1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{-i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \gt 0
\\
\frac{1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{+i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \lt 0
} \right.
\\
& = \left\{ \array{ \Delta_H(x,y) &\vert& (x^0 - y^0) \gt 0 \\ \Delta_H(y,x) &\vert& (x^0 - y^0) \lt 0 } \right.
\end{aligned}

where in the second line we used Euler's formula. The last line follows by comparison with (129) and using that the integral over $\vec k$ is invariant under $\vec k \mapsto - \vec k$.

The computation for $\Delta_{\overline{F}}$ is the same, only now with a minus sign in front of the cosine:

\begin{aligned}
\Delta_{\overline{F}}(x,y)
& = \left\{ \array{
\underset{ = \tfrac{i}{2} \Delta_+(x,y) + 0 \;\text{for}\; (x^0 - y^0) \gt 0 }{ \underbrace{ \frac{- i}{(2\pi)^{p}} \int \frac{1}{2 \omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k } }
- \underset{ = H(x,y) }{ \underbrace{ \frac{1}{(2\pi)^p} \int \frac{1}{2 \omega(\vec k)/c} \cos\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k } }
&\vert& (x^0 - y^0) \gt 0
\\
\underset{ = 0 + \tfrac{i}{2}\Delta_-(x,y) \;\text{for}\; (x^0 - y^0) \lt 0 }{ \underbrace{ \frac{+ i}{(2\pi)^{p}} \int \frac{1}{2 \omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k } }
- \underset{ = H(x,y) }{ \underbrace{ \frac{1}{(2\pi)^p} \int \frac{1}{2 \omega(\vec k)/c} \cos\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k } }
&\vert& (x^0 - y^0) \lt 0
} \right.
\\
& = \left\{ \array{
\frac{-1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{+i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \gt 0
\\
\frac{-1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{-i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \lt 0
} \right.
\\
& = \left\{ \array{ - \Delta_H(y,x) &\vert& (x^0 - y^0) \gt 0 \\ - \Delta_H(x,y) &\vert& (x^0 - y^0) \lt 0 } \right.
\end{aligned}
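The case distinction in this proof can be replayed for a single mode. The sketch below is only an illustration: `t` stands for $x^0 - y^0$ and `w` for $\omega(\vec k)/c$, the function names are hypothetical, and the support conventions for $\Delta_\pm$ (namely $\Delta_S = \Delta_+ - \Delta_-$ with $\Delta_+$ vanishing for $t \lt 0$ and $\Delta_-$ vanishing for $t \gt 0$) are assumptions read off from the underbraces above.

```python
import cmath
import math

def hadamard_mode(t, w):
    # Per-mode factor of the Hadamard propagator, second line of (129).
    return cmath.exp(-1j * w * t) / (2 * w)

def causal_mode(t, w):
    # Per-mode factor of the causal propagator, as in (132).
    return -math.sin(w * t) / w

def symmetric_mode(t, w):
    # Per-mode factor of the symmetric part H, as in (132).
    return math.cos(w * t) / (2 * w)

def feynman_mode(t, w, sign=+1):
    """sign=+1: Feynman combination; sign=-1: anti-Feynman combination."""
    # Assumed conventions: Delta_S = Delta_+ - Delta_-,
    # Delta_+ supported at t > 0, Delta_- supported at t < 0.
    d_plus = causal_mode(t, w) if t > 0 else 0.0
    d_minus = -causal_mode(t, w) if t < 0 else 0.0
    return 0.5j * (d_plus + d_minus) + sign * symmetric_mode(t, w)

w = 1.7  # hypothetical sample frequency
print(abs(feynman_mode(0.4, w) - hadamard_mode(0.4, w)))       # t > 0: Delta_H(x,y)
print(abs(feynman_mode(-0.4, w) - hadamard_mode(0.4, w)))      # t < 0: Delta_H(y,x)
print(abs(feynman_mode(0.4, w, -1) + hadamard_mode(-0.4, w)))  # t > 0: -Delta_H(y,x)
print(abs(feynman_mode(-0.4, w, -1) + hadamard_mode(-0.4, w))) # t < 0: -Delta_H(x,y)
```

On the level of a single mode, exchanging $x$ and $y$ amounts to `t -> -t`, since the spatial plane wave factor is restored by $\vec k \mapsto -\vec k$.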

As before for the causal propagator, there are equivalent reformulations of the Feynman propagator which are useful for computations:

Proposition

(Feynman propagator as a Cauchy principal value)

The Feynman propagator and anti-Feynman propagator (def. 108) for the Klein-Gordon equation on Minkowski spacetime are equivalently given by the following expressions, respectively:

\begin{aligned} \left. \array{ \Delta_F(x,y) \\ \Delta_{\overline{F}}(x,y) } \right\} & = \underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \frac{+i}{(2\pi)^{p+1}} \int \int_{-\infty}^\infty \frac{ e^{i k_\mu (x^\mu - y^\mu)} }{ - k_\mu k^\mu - \left( \tfrac{m c}{\hbar} \right)^2 \pm i \epsilon } \, d k_0 \, d^p \vec k \end{aligned}

where we have a limit of distributions as for the Cauchy principal value (this prop).

Proof

We compute as follows:

\begin{aligned}
\underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \frac{+i}{(2\pi)^{p+1}} \int \int_{-\infty}^\infty \frac{ e^{i k_\mu (x^\mu - y^\mu)} }{ - k_\mu k^\mu - \left( \tfrac{m c}{\hbar} \right)^2 \pm i \epsilon } \, d k_0 \, d^p \vec k
& = \underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \frac{+i}{(2\pi)^{p+1}} \int \int_{-\infty}^\infty \frac{ e^{i k_\mu (x^\mu - y^\mu)} }{ (k_0)^2 - \underset{ \coloneqq \omega_{\pm\epsilon}(\vec k)^2/c^2 }{\underbrace{ \left( \omega(\vec k)^2/c^2 \mp i \epsilon \right) }} } \, d k_0 \, d^p \vec k
\\
& = \underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \frac{+i}{(2\pi)^{p+1}} \int \int_{-\infty}^\infty \frac{ e^{i k_\mu (x^\mu - y^\mu)} }{ \left( k_0 - \omega_{\pm \epsilon}(\vec k)/c \right) \left( k_0 + \omega_{\pm \epsilon}(\vec k)/c \right) } \, d k_0 \, d^p \vec k
\\
& = \left\{ \array{
\frac{\pm 1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{\mp i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \gt 0
\\
\frac{\pm 1}{(2\pi)^p} \int \frac{1}{2\omega(\vec k)/c} e^{\pm i\omega(\vec k)(x^0 - y^0)/c} e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k &\vert& (x^0 - y^0) \lt 0
} \right.
\\
& = \left\{ \array{ \Delta_F(x,y) \\ \Delta_{\overline{F}}(x,y) } \right.
\end{aligned}

Here

  1. In the first step we introduced the complex square root $\omega_{\pm \epsilon}(\vec k)$. For this to be compatible with the choice of non-negative square root for $\epsilon = 0$ in (118) we need to choose that complex square root whose complex phase is one half that of $\omega(\vec k)^2/c^2 \mp i \epsilon$ (instead of that plus $\pi$). This means that $\omega_{+ \epsilon}(\vec k)$ is in the lower half plane and $\omega_{-\epsilon}(\vec k)$ is in the upper half plane.

  2. In the third step we observe that

    1. for $(x^0 - y^0) \gt 0$ the integrand decays for positive imaginary part of $k_0$ and hence the integration over $k_0$ may be deformed to a contour which encircles the pole in the upper half plane;

    2. for $(x^0 - y^0) \lt 0$ the integrand decays for negative imaginary part of $k_0$ and hence the integration over $k_0$ may be deformed to a contour which encircles the pole in the lower half plane;

    and then apply Cauchy's integral formula, which picks out $2\pi i$ times the residue at these poles.

    Notice that when completing to a contour in the lower half plane we pick up a minus sign from the fact that now the contour runs clockwise.

  3. In the fourth step we used prop. 70.
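The contour argument in these steps can be tested numerically at finite $\epsilon$. The following sketch, for the upper sign $+i\epsilon$ only, approximates the real-line $k_0$-integral by a truncated trapezoid rule and compares it with the value $-\pi i \, e^{-i w_\epsilon {\vert t\vert}}/w_\epsilon$ obtained from the residues. The truncation `cutoff`, the step size, the sample values and the function names are assumptions made for illustration; `t` stands for $x^0 - y^0$ and `w` for $\omega(\vec k)/c$.

```python
import cmath
import math

def k0_integral(t, w, eps, cutoff=400.0, step=0.005):
    """Trapezoid approximation of the integral over real k0 of
    exp(i k0 t) / (k0^2 - w^2 + i eps), truncated to [-cutoff, cutoff]."""
    n = int(round(2 * cutoff / step))
    total = 0j
    for j in range(n + 1):
        k0 = -cutoff + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(1j * k0 * t) / (k0 * k0 - w * w + 1j * eps)
    return total * step

def residue_value(t, w, eps):
    # Principal square root: its phase is half that of w^2 - i eps,
    # so w_eps lies in the lower half plane, as in item 1 above.
    w_eps = cmath.sqrt(w * w - 1j * eps)
    # t > 0: close in the upper half plane around the pole at -w_eps;
    # t < 0: close in the lower half plane (clockwise) around +w_eps.
    return -1j * math.pi * cmath.exp(-1j * w_eps * abs(t)) / w_eps

print(abs(k0_integral(1.0, 1.0, 0.2) - residue_value(1.0, 1.0, 0.2)))
print(abs(k0_integral(-1.5, 1.0, 0.2) - residue_value(-1.5, 1.0, 0.2)))
```

As $\epsilon \to 0$ the residue value tends to $-2\pi i$ times $e^{-i w {\vert t\vert}}/(2w)$, which is the per-mode factor of $\Delta_H(x,y)$ for $t \gt 0$ and of $\Delta_H(y,x)$ for $t \lt 0$.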

\,

singular support and wave front sets

We now discuss the singular support (def. 99) and the wave front sets (def. 101) of the various propagators for the Klein-Gordon equation on Minkowski spacetime.

Proposition

(singular support of the causal propagator of the Klein-Gordon equation on Minkowski spacetime is the light cone)

The singular support of the causal propagator $\Delta_S$ for the Klein-Gordon equation on Minkowski spacetime, regarded via translation invariance as a generalized function in a single variable (120), is the light cone of the origin:

supp_{sing}(\Delta_S) \;=\; \left\{ x \in \mathbb{R}^{p,1} \,\vert\, {\vert x\vert}^2_\eta = 0 \right\} \,.
Proof

By prop. 67 the causal propagator is equivalently the Fourier transform of distributions of the delta distribution of the mass shell times the sign function of the angular frequency; and by basic properties of the Fourier transform this is the convolution of distributions of the separate Fourier transforms:

\begin{aligned} \Delta_S(x) & \propto \widehat{ \delta\left( \eta^{-1}(k,k) + \left( \tfrac{m c}{\hbar}\right)^2 \right) sgn( k_0 ) } \\ &\propto \widehat{\delta\left( \eta^{-1}(k,k) + \left( \tfrac{m c}{\hbar}\right)^2 \right)} \star \widehat{sgn( k_0 )} \end{aligned}

By prop. 63, the singular support of the first convolution factor is the light cone.

The second factor is

\begin{aligned} \widehat{sgn(k_0)} & \propto \left(2\widehat{\Theta(k_0)} - \widehat{1}\right) \delta(\vec x) \\ & \propto \left(2\tfrac{1}{i x^0 + 0^+} - \delta(x^0)\right) \delta(\vec x) \end{aligned}

(by example 97 and example 90) and hence the wave front set (def. 101) of the second factor is

WF\left(\widehat{sgn(k_0)}\right) = \{(0,k) \;\vert\; k \in S(\mathbb{R}^{p+1})\}

(by example 85 and example 88).

With this the statement follows, via a partition of unity, from this prop.

For illustration we now make this general argument more explicit in the special case of spacetime dimension

p + 1 = 3 + 1

by computing an explicit form for the causal propagator in terms of the delta distribution, the Heaviside distribution and smooth Bessel functions.

We follow (Scharf 95 (2.3.18)).

Consider the formula for the causal propagator in terms of the mode expansion (127). Since the integrand here depends on the wave vector $\vec k$ only via its norm ${\vert \vec k\vert}$ and the angle $\theta$ it makes with the given spatial vector $\vec x - \vec y$, via

\vec k \cdot (\vec x - \vec y) \;=\; {\vert \vec k\vert} \, {\vert \vec x - \vec y\vert} \, \cos(\theta) \,,

we may express the integration in terms of polar coordinates as follows:

\begin{aligned}
\Delta_S(x - y)
& = \frac{-1}{(2\pi)^p} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y)} \, d^p \vec k
\\
& = \frac{- vol_{S^{p-2}}}{(2\pi)^p} \underset{{\vert \vec k\vert} \in \mathbb{R}_{\geq 0}}{\int} \underset{ \theta \in [0,\pi] }{\int} \frac{ 1 }{ \omega(\vec k)/c } \sin\left( \omega(\vec k) (x^0 - y^0) /c \right) e^{ i {\vert \vec k\vert} {\vert \vec x - \vec y\vert} \cos(\theta) } {\vert \vec k\vert} ({\vert \vec k\vert} \sin(\theta))^{p-2} \, d \theta \wedge d {\vert \vec k\vert}
\end{aligned}

In the special case of spacetime dimension $p + 1 = 3 + 1$ this becomes

(133) \begin{aligned}
\Delta_S(x - y)
& = \frac{- 2\pi}{(2\pi)^{3}} \underset{{\vert \vec k\vert} \in \mathbb{R}_{\geq 0}}{\int} \frac{ {\vert \vec k \vert}^2 }{ \omega(\vec k)/c } \sin\left( \omega(\vec k) (x^0 - y^0) /c \right) \underset{ = \tfrac{1}{i {\vert \vec k\vert}\, {\vert \vec x - \vec y\vert} } \left( e^{i {\vert \vec k\vert}\, {\vert \vec x - \vec y\vert}} - e^{-i {\vert \vec k\vert}\, {\vert \vec x - \vec y\vert}} \right) }{ \underbrace{ \underset{ \cos(\theta) \in [-1,1] }{\int} e^{ i {\vert \vec k\vert} {\vert \vec x - \vec y\vert} \cos(\theta) } d \cos(\theta) } } \wedge d {\vert \vec k \vert}
\\
& = \frac{- 2}{(2\pi)^{2} {\vert \vec x - \vec y\vert}} \underset{{\vert \vec k\vert} \in \mathbb{R}_{\geq 0}}{\int} \frac{ {\vert \vec k \vert} }{ \omega(\vec k)/c } \sin\left( \omega(\vec k) (x^0 - y^0) /c \right) \sin\left( {\vert \vec k\vert}\, {\vert \vec x - \vec y\vert} \right) \, d {\vert \vec k\vert}
\\
& = \frac{+ 2}{(2\pi)^{2} {\vert \vec x - \vec y\vert}} \frac{d}{d {\vert \vec x - \vec y \vert } } \underset{{\vert \vec k\vert} \in \mathbb{R}_{\geq 0}}{\int} \frac{ 1 }{ \omega(\vec k)/c } \sin\left( \omega(\vec k) (x^0 - y^0) /c \right) \cos\left( {\vert \vec k\vert}\, {\vert \vec x - \vec y\vert} \right) \, d {\vert \vec k\vert}
\\
& = \frac{+ 1}{(2\pi)^{2} {\vert \vec x - \vec y\vert}} \frac{d}{d {\vert \vec x - \vec y \vert } } \underset{ \kappa \in \mathbb{R} }{\int} \frac{ 1 }{ \omega(\kappa)/c } \sin\left( \omega(\kappa) (x^0 - y^0) /c \right) \cos\left( \kappa\, {\vert \vec x - \vec y\vert} \right) \, d \kappa
\\
& = \frac{+ 1}{2(2\pi)^{2} {\vert \vec x - \vec y\vert}} \frac{d}{d {\vert \vec x - \vec y \vert} } \left( \underset{\coloneqq I_+}{ \underbrace{ \underset{ \kappa \in \mathbb{R} }{\int} \frac{ 1 }{ \omega(\kappa)/c } \sin\left( \omega(\kappa) (x^0 - y^0) /c + \kappa\, {\vert \vec x - \vec y\vert} \right) d\kappa } } + \underset{ \coloneqq I_- }{ \underbrace{ \underset{ \kappa \in \mathbb{R} }{\int} \frac{ 1 }{ \omega(\kappa)/c } \sin\left( \omega(\kappa) (x^0 - y^0) /c - \kappa\, {\vert \vec x - \vec y\vert} \right) \, d \kappa } } \right) \,.
\end{aligned}

Here in the third step we used $\frac{d}{d r} \cos(\kappa r) = - \kappa \sin(\kappa r)$ for $r = {\vert \vec x - \vec y\vert}$, which accounts for the change of sign. In the second but last step we renamed $\kappa \coloneqq {\vert \vec k\vert}$ and doubled the integration domain for convenience, and in the last step we used the trigonometric identity $\sin(\alpha) \cos(\beta)\;=\; \tfrac{1}{2} \left( \sin(\alpha + \beta) + \sin(\alpha - \beta) \right)$.
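Two elementary ingredients of this computation, the angular integral under the brace in (133) and the product-to-sum identity used in the last step, can be checked numerically. The sketch below uses a plain midpoint rule; the sample value `a` (standing for ${\vert \vec k\vert}\,{\vert \vec x - \vec y\vert}$) and the function names are illustrative assumptions.

```python
import cmath
import math

def angular_integral(a, n=20000):
    """Midpoint-rule value of the integral of exp(i a u) over u in [-1, 1],
    where u plays the role of cos(theta)."""
    h = 2.0 / n
    return h * sum(cmath.exp(1j * a * (-1.0 + (j + 0.5) * h)) for j in range(n))

def angular_closed_form(a):
    # (exp(i a) - exp(-i a)) / (i a) = 2 sin(a) / a
    return 2.0 * math.sin(a) / a

def product_to_sum(alpha, beta):
    # Right-hand side of sin(alpha) cos(beta) = (sin(alpha+beta) + sin(alpha-beta)) / 2
    return 0.5 * (math.sin(alpha + beta) + math.sin(alpha - beta))

a = 3.7  # hypothetical sample value of |k| |x - y|
print(abs(angular_integral(a) - angular_closed_form(a)))
print(abs(math.sin(0.9) * math.cos(2.3) - product_to_sum(0.9, 2.3)))
```

The factor $2 \sin(a)/a$ is what turns the prefactor $-2\pi/(2\pi)^3$ of the first line of (133) into $-2/\left((2\pi)^2 {\vert \vec x - \vec y\vert}\right)$ in the second line.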

In order to further evaluate this, we parameterize the remaining components $(\omega/c, \kappa)$ of the wave vector by the dual rapidity $z$, via

\left(\cosh(z)\right)^2 - \left( \sinh(z)\right)^2 = 1

as

\omega(\kappa)/c \;=\; \left( \tfrac{m c}{\hbar} \right) \cosh(z) \,, \qquad \kappa \;=\; \left( \tfrac{m c}{\hbar} \right) \sinh(z) \,,

which makes use of the fact that $\omega(\kappa)$ is non-negative, by construction. Since $d \kappa = (\omega(\kappa)/c) \, d z$, this change of integration variables makes the integrals under the braces above become

(134) I_\pm \;=\; \int_{-\infty}^\infty \sin\left( \tfrac{m c}{\hbar} \left( (x^0 - y^0) \cosh(z) \pm {\vert \vec x - \vec y\vert} \sinh(z) \right) \right) \, d z \,.
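The effect of the rapidity substitution on the integration measure, $d\kappa/(\omega(\kappa)/c) = d z$, can be illustrated numerically. Because the integrals $I_\pm$ themselves are oscillatory, the sketch below replaces the sine by a hypothetical rapidly decaying test function `g`; the value of `M` (standing for $m c/\hbar$), the integration ranges and the helper names are likewise assumptions.

```python
import math

M = 1.3  # hypothetical value of m c / hbar

def omega_over_c(kappa):
    # Dispersion relation restricted to the radial variable kappa.
    return math.sqrt(kappa * kappa + M * M)

def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + j * h) for j in range(1, n)) + 0.5 * f(b))

def g(kappa):
    # Hypothetical test function standing in for the oscillatory integrand.
    return math.exp(-kappa * kappa)

# Integral in the variable kappa, with the weight 1 / (omega / c) ...
lhs = trapezoid(lambda kappa: g(kappa) / omega_over_c(kappa), -10.0, 10.0, 20000)
# ... and the same integral in the rapidity z, where the weight has disappeared.
rhs = trapezoid(lambda z: g(M * math.sinh(z)), -4.0, 4.0, 20000)
print(abs(lhs - rhs))
print(abs(omega_over_c(M * math.sinh(0.8)) - M * math.cosh(0.8)))
```

The second printed number checks that $\kappa = \tfrac{m c}{\hbar}\sinh(z)$ indeed implies $\omega(\kappa)/c = \tfrac{m c}{\hbar}\cosh(z)$.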

Next we similarly parameterize the vector $x-y$ by its rapidity $\tau$. That parameterization depends on whether $x-y$ is spacelike or not, and if not, whether it is future or past directed.

First, if $x-y$ is spacelike in that ${\vert x-y\vert}^2_\eta \gt 0$ then we may parameterize as

(x^0 - y^0) = \sqrt{{\vert x-y\vert}^2_\eta} \sinh(\tau) \,, \qquad {\vert \vec x - \vec y\vert} = \sqrt{ {\vert x-y\vert}^2_\eta} \cosh(\tau)

which yields

\begin{aligned}
I_{\pm}
& = \int_{-\infty}^\infty \sin\left( \tfrac{m c}{\hbar} \sqrt{ {\vert x-y\vert}^2_\eta } \left( \sinh(\tau) \cosh(z) \pm \cosh(\tau) \sinh(z) \right) \right) \, d z
\\
& = \int_{-\infty}^\infty \sin\left( \tfrac{m c}{\hbar} \sqrt{ {\vert x-y\vert}^2_\eta} \left( \sinh\left( \tau \pm z\right) \right) \right) \, d z
\\
& = \int_{-\infty}^\infty \sin\left( \tfrac{m c}{\hbar} \sqrt{ {\vert x-y\vert}^2_\eta } \left( \sinh\left( z \right) \right) \right) \, d z
\\
& = 0 \,,
\end{aligned}

where in the third line we shifted the integration variable, and in the last line we observe that the integrand is an odd (skew-symmetric) function of $z$.

Second, if $x-y$ is timelike with $(x^0 - y^0) \gt 0$ then we may parameterize as

(x^0 - y^0) = \sqrt{ -{\vert x-y\vert}^2_\eta} \cosh(\tau) \,, \qquad {\vert \vec x - \vec y\vert} = \sqrt{ -{\vert x - y\vert}^2_\eta } \sinh(\tau)

which yields

(135) \begin{aligned}
I_\pm
& = \int_{-\infty}^\infty \sin\left( \tfrac{m c}{\hbar} \left( (x^0 - y^0) \cosh(z) \pm {\vert \vec x - \vec y\vert} \sinh(z) \right) \right) \, d z
\\
& = \int_{-\infty}^\infty \sin\left( \sqrt{ - {\vert x-y\vert}^2_\eta } \tfrac{m c}{\hbar} \left( \cosh(\tau)\cosh(z) \pm \sinh(\tau) \sinh(z) \right) \right) \, d z
\\
& = \int_{-\infty}^\infty \sin\left( \sqrt{ - {\vert x-y\vert}^2_\eta } \tfrac{m c}{\hbar} \left( \cosh(z \pm \tau) \right) \right) \, d z
\\
& = \pi J_0\left( \sqrt{ - {\vert x-y\vert}^2_\eta} \tfrac{m c}{\hbar} \right) \,.
\end{aligned}

Here in the last line we identified the integral representation of the Bessel function $J_0$ of order 0 (see here). The important point here is that this is a smooth function.

Similarly, if $x-y$ is timelike with $(x^0 - y^0) \lt 0$ then the same argument yields

I_\pm = - \pi J_0\left( \sqrt{ - {\vert x-y\vert}^2_\eta} \tfrac{m c}{\hbar} \right)

In conclusion, the general form of $I_\pm$ is

I_\pm = \pi \, sgn(x^0 - y^0) \Theta\left( -{\vert x-y\vert}^2_\eta \right) J_0\left( \sqrt{ - {\vert x-y\vert}^2_\eta } \tfrac{m c}{\hbar} \right) \,.

Therefore we end up with

(136) \begin{aligned}
\Delta_S(x,y)
& = \frac{1}{4 \pi {\vert \vec x - \vec y\vert}} \frac{d}{d {\vert \vec x - \vec y\vert}} sgn(x^0 - y^0) \Theta\left( -{\vert x-y\vert}^2_\eta \right) J_0\left( \tfrac{m c}{\hbar} \sqrt{ -{\vert x-y\vert}^2_\eta } \right)
\\
& = \frac{-1}{2 \pi } \frac{d}{d (-{\vert x-y\vert}^2_\eta)} sgn(x^0 - y^0) \Theta\left( -{\vert x-y\vert}^2_\eta \right) J_0\left( \tfrac{m c}{\hbar} \sqrt{-{\vert x-y \vert}^2_\eta} \right)
\\
& = \frac{-1}{2\pi} sgn(x^0 - y^0) \left( \delta\left( -{\vert x-y\vert}^2_\eta \right) \;+\; \Theta\left( -{\vert x-y\vert}^2_\eta \right) \frac{d}{d \left(-{\vert x-y\vert}^2_\eta\right) } J_0\left( \tfrac{m c}{\hbar} \sqrt{ -{\vert x-y\vert}^2_\eta } \right) \right)
\end{aligned}

where the last step is the product rule, using that $J_0(0) = 1$.
Proposition

(singular support of the Hadamard propagator of the Klein-Gordon equation on Minkowski spacetime is the light cone)

The singular support of the Hadamard propagator $\Delta_H$ (def. 107) for the Klein-Gordon equation on Minkowski spacetime, regarded via translation invariance as a distribution in a single variable, is the light cone of the origin:

supp_{sing}(\Delta_H) = \left\{ x \in \mathbb{R}^{p,1} \;\vert\; {\vert x\vert}^2_\eta = 0 \right\} \,.
Proof

By def. 107, equation (129), the Hadamard propagator is the Fourier transform of distributions of the delta distribution of the mass shell times the step function of the temporal component of the wave vector; and by basic properties of the Fourier transform this is the convolution of distributions of the separate Fourier transforms:

\begin{aligned} \Delta_H(x) & \propto \widehat{ \delta\left( \eta^{-1}(k,k) + \left( \tfrac{m c}{\hbar}\right)^2 \right) \Theta( -k_0 ) } \\ &\propto \widehat{\delta\left( \eta^{-1}(k,k) + \left( \tfrac{m c}{\hbar}\right)^2 \right)} \star \widehat{\Theta( -k_0 )} \end{aligned}

By prop. 63, the singular support of the first convolution factor is the light cone.

The second factor is

\widehat{\Theta(-k_0)} \propto \tfrac{1}{-i x^0 + 0^+} \delta(\vec x)

(by example 97 and example 90) and hence the wave front set (def. 101) of the second factor lies over the origin:

WF\left(\widehat{\Theta(-k_0)}\right) \subset \{(0,k) \;\vert\; k \in S(\mathbb{R}^{p+1})\}

(by example 85 and example 88).

With this the statement follows, via a partition of unity, from prop. 57.

For illustration, we now make this general statement fully explicit in the special case of spacetime dimension

p + 1 = 3 + 1

by computing an explicit form for the causal propagator in terms of the delta distribution, the Heaviside distribution and smooth Bessel functions.

We follow (Scharf 95 (2.3.36)).

By (132) we have

\begin{aligned}
\Delta_H(x,y) & = \tfrac{i}{2} \underset{= \Delta_S(x,y)}{ \underbrace{ \frac{-1}{(2\pi)^p} \int \frac{1}{\omega(\vec k)/c} \sin\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k }} \\
& \phantom{=} \;+\; \underset{ \coloneqq H(x,y) }{ \underbrace{ \frac{1}{(2\pi)^p} \int \frac{1}{2 \omega(\vec k)/c} \cos\left( \omega(\vec k)(x^0 - y^0)/c \right) e^{i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k }}
\end{aligned}

The first summand is proportional to the causal propagator, which we computed as (136) in prop. 72 to be

\tfrac{i}{2}\Delta_S(x,y) \;=\; \frac{-i}{4\pi} sgn(x^0) \left( \delta\left( -{\vert x-y\vert}^2_\eta \right) \;-\; \Theta\left( -{\vert x-y\vert}^2_\eta \right) \frac{d}{d \left({-\vert x-y\vert}^2_\eta\right) } J_0\left( \tfrac{m c}{\hbar} \sqrt{ -{\vert x-y\vert}^2_\eta } \right) \right) \,.

The second term is computed in a directly analogous fashion: The integrals I_\pm from (134) are now

I_\pm \coloneqq \int_{-\infty}^\infty \cos\left( \tfrac{m c}{\hbar} \left( (x^0 - y^0) \cosh(z) \pm {\vert \vec x - \vec y\vert} \sinh(z) \right) \right) \, d z

Parameterizing by rapidity, as in the proof of prop. 72, one finds that for timelike x-y this is

\begin{aligned} I_\pm & = \int_{-\infty}^\infty \cos\left( \tfrac{m c}{\hbar} \sqrt{ {\vert x-y\vert}^2_\eta } \left( \cosh\left( z \right) \right) \right) \, d z \\ & = - \pi N_0 \left( \tfrac{m c}{\hbar} \sqrt{ {\vert x-y\vert}^2_\eta } \right) \end{aligned}

while for spacelike x-y it is

\begin{aligned} I_\pm & = \int_{-\infty}^\infty \cos\left( \tfrac{m c}{\hbar} \sqrt{ {\vert x-y\vert}^2_\eta } \left( \sinh\left( z \right) \right) \right) \, d z \\ & = 2 K_0 \left( \tfrac{m c}{\hbar} \sqrt{ {\vert x-y\vert}^2_\eta } \right) \,, \end{aligned}

where we identified the integral representations of the Neumann function N_0 and of the modified Bessel function K_0.

As for the Bessel function J_0 in (135), the key point is that these are smooth functions. Hence we conclude that

H(x,y) \;\propto\; \frac{d}{d \left( {\vert x-y\vert}^2_\eta \right)} \left( -\Theta\left( -{\vert x-y\vert}^2_\eta \right) N_0 \left( \tfrac{m c}{\hbar} \sqrt{ {\vert x-y\vert}^2_\eta } \right) + \Theta\left( {\vert x-y\vert}^2_\eta \right) \tfrac{2}{\pi} K_0 \left( \tfrac{m c}{\hbar} \sqrt{ {\vert x-y\vert}^2_\eta } \right) \right) \,.

This expression has singularities on the light cone due to the step functions. In fact the expression being differentiated is continuous at the light cone (Scharf 95 (2.3.34)), so that the singularity on the light cone is not a delta distribution singularity from the derivative of the step functions. Accordingly it does not cancel the singularity of \tfrac{i}{2}\Delta_S(x,y) as above, and hence the singular support of \Delta_H is still the whole light cone.

Proposition

(singular support of Feynman propagator for Klein-Gordon equation on Minkowski spacetime)

The singular support of the Feynman propagator \Delta_F and of the anti-Feynman propagator \Delta_{\overline{F}} (def. 107) for the Klein-Gordon equation on Minkowski spacetime, regarded via translation invariance as a distribution in a single variable, is the light cone of the origin:

\left. \array{ supp_{sing}(\Delta_F) \\ supp_{sing}(\Delta_{\overline{F}}) } \right\} = \left\{ x \in \mathbb{R}^{p,1} \;\vert\; {\vert x\vert}^2_\eta = 0 \right\} \,.

(e.g. DeWitt 03 (27.85))

Proof

By prop. 71 the Feynman propagator is equivalently the Cauchy principal value of the inverse of the Fourier transformed Klein-Gordon operator:

\Delta_F \;\propto\; \widehat{ \frac{1}{-k_\mu k^\mu - \left(\tfrac{m c}{\hbar}\right)^2 + i 0^+} } \,.

With this, the statement follows immediately from prop. 62.

Proposition

(wave front sets of propagators of Klein-Gordon equation on Minkowski spacetime)

The wave front sets of the various propagators for the Klein-Gordon equation on Minkowski spacetime, regarded, via translation invariance, as distributions in a single variable, are as follows:

  • the causal propagator \Delta_S (prop. 65) has wave front set all pairs (x,k) with x and k both on the light cone:
WF(\Delta_S) = \left\{ (x,k) \,\vert\, {\vert x\vert}^2_\eta = 0 \;\text{and} \; {\vert k\vert}^2_\eta = 0 \; \text{and} \, k \neq 0 \right\}

  • the Hadamard propagator \Delta_H (def. 107) has wave front set all pairs (x,k) with x and k both on the light cone and k^0 \gt 0:
WF(\Delta_H) = \left\{ (x,k) \,\vert\, {\vert x\vert}^2_\eta = 0 \;\text{and} \; {\vert k\vert}^2_\eta = 0 \; \text{and} \; k^0 \gt 0 \right\}

  • the Feynman propagator \Delta_F (def. 108) has wave front set all pairs (x,k) with x and k both on the light cone and \pm k_0 \gt 0 \;\Leftrightarrow\; \pm x^0 \gt 0:
WF(\Delta_F) = \left\{ (x,k) \,\vert\, {\vert x\vert}^2_\eta = 0 \;\text{and} \; {\vert k\vert}^2_\eta = 0 \; \text{and} \; \left( \pm k_0 \gt 0 \;\Leftrightarrow\; \pm x^0 \gt 0 \right) \right\}

(Radzikowski 96, (16))

Proof

First regarding the causal propagator:

By prop. 72 the singular support of \Delta_S is the light cone.

Since the causal propagator is a solution to the homogeneous Klein-Gordon equation, the propagation of singularities theorem (prop. 59) says that also all wave vectors in the wave front set are lightlike. Hence it just remains to show that all non-vanishing lightlike wave vectors based on the light cone in spacetime indeed do appear in the wave front set.

To that end, let b \in C^\infty_{cp}(\mathbb{R}^{p,1}) be a bump function whose compact support includes the origin.

For a \in \mathbb{R}^{p,1} a point on the light cone, we need to determine the decay property of the Fourier transform of x \mapsto b(x-a)\Delta_S(x). This is the convolution of distributions of \hat b(k)e^{i k_\mu a^\mu} with \widehat \Delta_S(k). By prop. 67 we have

\widehat \Delta_{S}(k) \;\propto\; \delta\left( -k_\mu k^\mu - \left( \tfrac{m c}{\hbar} \right)^2 \right) sgn(k_0) \,.

This means that the convolution product is the smearing of the mass shell by \widehat b(k)e^{i k_\mu a^\mu}.

Since the mass shell asymptotes to the light cone, and since e^{i k_\mu a^\mu} = 1 for k on the light cone (given that a is on the light cone), this implies the claim.

Now for the Hadamard propagator:

By def. 107 its Fourier transform is of the form

\widehat \Delta_H(k) \;\propto\; \delta\left( k_\mu k^\mu + m^2 \right) \Theta( -k_0 )

Moreover, its singular support is also the light cone (prop. 73).

Therefore the same argument as before says that the wave front set consists of wave vectors k on the light cone, but now, due to the step function factor \Theta(-k_0), they must satisfy 0 \leq - k_0 = k^0.

Finally regarding the Feynman propagator:

By prop. 70 the Feynman propagator coincides with the positive frequency Hadamard propagator for x^0 \gt 0 and with the “negative frequency Hadamard propagator” for x^0 \lt 0. Therefore the form of WF(\Delta_F) now follows directly from that of WF(\Delta_H) above.

\,

propagators for the Dirac equation on Minkowski spacetime

We now discuss how the propagators for the free Dirac field on Minkowski spacetime (example 64) follow directly from those for the scalar field discussed above.

Proposition

(advanced and retarded propagator for Dirac equation on Minkowski spacetime)

Consider the Dirac operator on Minkowski spacetime, which in Feynman slash notation reads

\begin{aligned} D & \coloneqq -i {\partial\!\!\!/\,} + \tfrac{m c}{\hbar} \\ & = -i \gamma^\mu \frac{\partial}{\partial x^\mu} + \tfrac{m c}{\hbar} \end{aligned} \,.

Its advanced and retarded propagators (def. 78) are the derivatives of distributions of the advanced and retarded propagators \Delta_\pm for the Klein-Gordon equation (prop. 64) by -i{\partial\!\!\!/\,} - \tfrac{m c}{\hbar}:

\Delta_{D, \pm} \;=\; \left( -i{\partial\!\!\!/\,} - \tfrac{m c}{\hbar} \right) \Delta_{\pm} \,.

Hence the same is true for the causal propagator:

\Delta_{D, S} \;=\; \left( -i{\partial\!\!\!/\,} - \tfrac{m c}{\hbar} \right) \Delta_{S} \,.
Proof

Applying a differential operator does not change the support of a smooth function, hence also not the support of a distribution. Therefore the uniqueness of the advanced and retarded propagators (prop. 41), together with the translation invariance and the formal anti-self-adjointness of the Dirac operator (as for the Klein-Gordon operator (120)), implies that it is sufficient to check that applying the Dirac operator to the \Delta_{D, \pm} yields the delta distribution. This follows since the Dirac operator squares to the Klein-Gordon operator:

\begin{aligned} \left( -i{\partial\!\!\!/\,} + \tfrac{m c}{\hbar} \right) \Delta_{D, \pm} & = \underset{ = \Box - \left(\tfrac{m c}{\hbar}\right)^2}{ \underbrace{ \left( -i{\partial\!\!\!/\,} + \tfrac{m c}{\hbar} \right) \left( -i{\partial\!\!\!/\,} - \tfrac{m c}{\hbar} \right) } } \Delta_{\pm} \\ & = \delta \end{aligned} \,.
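The step "the Dirac operator squares to the Klein-Gordon operator" can be checked numerically in momentum space. On a plane wave e^{i k_\mu x^\mu} the operator -i{\partial\!\!\!/\,} acts as multiplication by {k\!\!\!/\,} = \gamma^\mu k_\mu, so the claim becomes the matrix identity ({k\!\!\!/\,} + m)({k\!\!\!/\,} - m) = -k_\mu k^\mu - m^2, where m stands for \tfrac{m c}{\hbar}. The following is a minimal sketch only. The Dirac representation of the gamma matrices, the metric \eta = diag(-1,+1,+1,+1) and the Clifford relation \gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = -2 \eta^{\mu\nu} are assumptions made for this illustration, and all function names are hypothetical.

```python
# Sketch only. Assumptions not taken from the text: Dirac representation,
# eta = diag(-1, +1, +1, +1), and the Clifford relation
# gamma^mu gamma^nu + gamma^nu gamma^mu = -2 eta^{mu nu}.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(coeffs, mats):
    """Linear combination sum_i coeffs[i] * mats[i] of square matrices."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

def scalar(c, n=4):
    """The matrix c * identity."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg(m):
    return [[-x for x in row] for row in m]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]  # Pauli matrices
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]
ETA_INV = [-1, 1, 1, 1]  # diagonal of the inverse Minkowski metric

def clifford_holds():
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = -2 eta^{mu nu}."""
    for mu in range(4):
        for nu in range(4):
            anti = lincomb([1, 1], [matmul(GAMMA[mu], GAMMA[nu]),
                                    matmul(GAMMA[nu], GAMMA[mu])])
            if anti != scalar(-2 * ETA_INV[mu] if mu == nu else 0):
                return False
    return True

def dirac_product(k, m):
    """(k-slash + m)(k-slash - m) for a covector k = (k_0, k_1, k_2, k_3)."""
    kslash = lincomb(k, GAMMA)
    plus = lincomb([1, 1], [kslash, scalar(m)])
    minus = lincomb([1, -1], [kslash, scalar(m)])
    return matmul(plus, minus)

def klein_gordon_symbol(k, m):
    """-k_mu k^mu - m^2, the Fourier symbol of the Klein-Gordon operator."""
    k_squared = sum(ETA_INV[mu] * k[mu] ** 2 for mu in range(4))
    return -k_squared - m ** 2
```

With integer test data the comparison is exact: for any covector k and mass m, dirac_product(k, m) should equal scalar(klein_gordon_symbol(k, m)), which is the same symbol that appears in the denominator of the Feynman propagator above.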

Similarly we obtain the other propagators for the Dirac field from those of the real scalar field:

Definition

(Hadamard propagator for Dirac operator on Minkowski spacetime)

The Hadamard propagator for the Dirac operator on Minkowski spacetime is the positive frequency part of the causal propagator (prop. 76), hence the derivative of distributions of the Hadamard propagator for the Klein-Gordon field (def. 107) by the Dirac operator:

\begin{aligned}
\left( -i{\partial\!\!\!/\,} + \tfrac{m c}{\hbar} \right)\Delta_{H}(x,y) & = \frac{1}{(2\pi)^p} \int \delta\left( k_\mu k^\mu + m^2 \right) \Theta( -k_0 ) ( {k\!\!\!/\,} + \tfrac{m c}{\hbar}) e^{i k_\mu (x^\mu-y^\mu) } \, d^{p+1} k \\
& = \frac{1}{(2\pi)^p} \int \frac{ \gamma^0 \omega(\vec k)/c + \vec \gamma \cdot \vec k + \tfrac{m c}{\hbar} }{2 \omega(\vec k)/c} e^{-i \omega(\vec k)(x^0 - y^0)/c + i \vec k \cdot (\vec x - \vec y) } \, d^p \vec k \,.
\end{aligned}

Here we used the expression for the Hadamard propagator of the Klein-Gordon equation (def. 107).

Definition

(Feynman propagator for Dirac operator on Minkowski spacetime)

The Feynman propagator for the Dirac operator on Minkowski spacetime is the linear combination

\Delta_{D, F} \;\coloneqq\; \Delta_{D,H} + i \Delta_{D, -}

of the Hadamard propagator (def. 109) and the retarded propagator (prop. 76). By prop. 71 this means that it is the derivative of distributions of the Feynman propagator of the Klein-Gordon equation (def. 108) by the Dirac operator

\begin{aligned}
\Delta_{D, F} & = \left( -i{\partial\!\!\!/\,} + \tfrac{m c}{\hbar} \right)\Delta_{F}(x,y) \\
& = \underset{ {\epsilon \in (0,\infty)} \atop {\epsilon \to 0} }{\lim} \frac{-i}{(2\pi)^{p+1}} \int \int_{-\infty}^\infty \frac{ \left( {k\!\!\!/\,} + \tfrac{m c}{\hbar} \right) e^{i k_\mu (x^\mu - y^\mu)} }{ - k_\mu k^\mu - \left( \tfrac{m c}{\hbar} \right)^2 + i \epsilon } \, d k_0 \, d^p \vec k \,.
\end{aligned}

\,

\,

This concludes our discussion of propagators induced from the covariant phase space of Green hyperbolic free Lagrangian field theory. These propagators will be the key for quantization via causal perturbation theory. But not all free field theories have a covariant phase space of Green hyperbolic equations of motion; for instance the electromagnetic field, a priori, does not. Therefore, before turning to quantization in the next chapter, we first discuss how gauge symmetries obstruct the existence of Green hyperbolic equations of motion.

\,

Gauge symmetries

The existence of the covariant phase space (prop. 46) of a Lagrangian field theory requires the existence of Cauchy surfaces (def. 87) for its Euler-Lagrange equations of motion. In the case of free field theory (def. 62) this means that the equations of motion are Green hyperbolic (def. 79).

We have seen that this is the case for instance for the scalar field (example 63) and the Dirac field (example 64), but it is not the case generally, for instance it fails for the electromagnetic field (example 46), the Yang-Mills field (example 41) and the B-field (example 42). An obstruction to the existence of the covariant phase space turns out to be (prop. 77 below) the presence of infinitesimal symmetries of the Lagrangian (def. 66) that have compact spacetime support (def. 81).

A class of examples of such are those infinitesimal symmetries of the Lagrangian which occur linearly parameterized by arbitrary sections (and their derivatives) of some vector bundle on spacetime, because then for every choice of section of compact support the corresponding symmetry will have compact spacetime support. These parameterized infinitesimal symmetries of the Lagrangian are called infinitesimal gauge symmetries, and their parameters we call the gauge parameters (def. 111 below).

Typically all compactly supported infinitesimal symmetries of the Lagrangian arise from parameterized symmetries this way; this is notably the case for the Lagrangian density of the electromagnetic field (example 92) and more generally of the Yang-Mills field.

Therefore the presence of infinitesimal symmetries of the Lagrangian with compact spacetime support is a defect of the theory which however implies its own solution, by indicating which relations ought to be promoted to “gauge equivalences”.

This obstruction is neatly captured by the cochain cohomology of the local BV-complex (def. 85) of the Lagrangian field theory (prop. 83 below). This may be understood as the algebra of functions on an extension of the jet bundle from a (locally pro-finite dimensional, prop. 112) smooth manifold to a differential graded manifold. This appearance of homotopy theory in the guise of homological algebra in Lagrangian field theory paves the way to understanding the cause of the obstruction: It disappears when the field bundle (or more generally its jet bundle) is promoted to its infinitesimal homotopy quotient by the action of these compactly supported symmetries (the “action Lie algebroid”, def. 115 below).

Passing to this homotopy quotient means to hard-wire into the geometry of the types of field their equivalence under these symmetries: in physics this is called gauge equivalence. The result is called the “reduced phase space”, which we turn to further below.

\,

We now discuss these topics

\,

As an immediate corollary of prop. 34 we have the following important observation:

Proposition

(spacetime-compactly supported and on-shell non-trivial infinitesimal symmetries of the Lagrangian obstruct the covariant phase space)

Let (E,\mathbf{L}) be a Lagrangian field theory over a Lorentzian spacetime.

If there exists a single infinitesimal symmetry of the Lagrangian v (def. 66) such that

  1. it has compact spacetime support (def. 81)

  2. it does not vanish on-shell (49) (so not a trivial one, example 91)

then there does not exist any Cauchy surface (def. 87) for the Euler-Lagrange equations of motion (def. 61) outside the spacetime support of v.

Proof

By prop. 34 the flow along \hat v preserves the on-shell space of field histories. Now the assumption that \hat v does not vanish on-shell implies that this flow is non-trivial, hence that it does continuously change the field histories over some points of spacetime, while the assumption that it has compact spacetime support means that these changes are confined to a compact subset of spacetime.

This means that there is a continuum of solutions to the equations of motion whose restrictions to the infinitesimal neighbourhood of any codimension-1 surface \Sigma_p \hookrightarrow \Sigma outside of this compact support coincide. Therefore this restriction map is not an isomorphism and \Sigma_p is not a Cauchy surface for the equations of motion.

Notice that there always exist spacetime-compactly supported infinitesimal symmetries that however do vanish on-shell:

Example

(trivial implicit infinitesimal gauge symmetries)

Let (E,\mathbf{L}) be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23), so that the Lagrangian density is canonically of the form

\mathbf{L} = L \, dvol_\Sigma

with Lagrangian function L \in \Omega^{0,0}_\Sigma(E) = C^\infty(J^\infty_\Sigma(E)) a smooth function of the jet bundle (characterized by prop. 19).

Then every evolutionary vector field (def. 64) whose coefficients are proportional to the Euler-Lagrange derivative (47) of the Lagrangian function L

v \; \coloneqq \; \frac{\delta_{EL} L }{\delta \phi^a} \kappa^{[a b]} \, \partial_{\phi^b} \;\in\; \Gamma_E^{ev}( T_\Sigma E )

by smooth coefficient functions \kappa^{a b}

\kappa^{[a b]} \;\in\; \Omega^{0,0}_\Sigma(E)

such that

  1. each \kappa^{a b} has compact spacetime support (def. 81)

  2. \kappa is skew-symmetric in its indices: \kappa^{[a b]} = - \kappa^{[b a]}

is an implicit infinitesimal gauge symmetry (def. \ref{ImplicitInfinitesimalGaugeSymmetry}).

This is so for a “trivial reason”, namely due to that skew-symmetry:

\begin{aligned}
\mathcal{L}_{\hat v} \mathbf{L} & = \iota_{\hat v} \delta \mathbf{L} \\
&= \iota_{\hat v} ( \delta_{EL}\mathbf{L} - d \Theta_{BFV} ) \\
& = \iota_{\hat v} \frac{\delta_{EL}L}{\delta \phi^a} \delta \phi^a \, dvol_\Sigma + d \iota_{\hat v}\Theta_{BFV} \\
& = \underset{= 0}{ \underbrace{ \left( \frac{\delta_{EL} L }{\delta \phi^a} \right) \left( \frac{\delta_{EL} L }{\delta \phi^b} \right) \kappa^{[a b]} } } \, dvol_\Sigma \;+\; d \iota_{\hat v} \Theta_{BFV} \\
& = d \iota_{\hat v} \Theta_{BFV}
\end{aligned}

Here the first steps are just recalling those in the proof of Noether's theorem I (prop. 30) while the last step follows with the skew-symmetry of \kappa.

Notice that this means that

  1. the Noether current (73) vanishes: J_{\hat v} = 0;

  2. the infinitesimal symmetry vanishes on-shell (41): \hat v \vert_{\mathcal{E}} = 0.

Therefore these implicit infinitesimal gauge symmetries are called the trivial infinitesimal gauge transformations.

(e.g. Henneaux 90, section 2.5)
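The mechanism behind this example is purely algebraic: a skew-symmetric matrix contracted with the same vector on both indices gives zero. The following minimal sketch illustrates this at a single point of the jet bundle, with the values of the Euler-Lagrange derivatives \frac{\delta_{EL} L}{\delta \phi^a} and of \kappa^{[a b]} replaced by plain numbers. The function names and the sample data are hypothetical and not part of the text.

```python
from fractions import Fraction

def skew_part(m):
    """Skew-symmetric part (m - m^T)/2 of a square matrix."""
    n = len(m)
    return [[Fraction(m[a][b] - m[b][a], 2) for b in range(n)] for a in range(n)]

def contract(el, kappa):
    """The sum over a, b of el[a] * el[b] * kappa[a][b]."""
    n = len(el)
    return sum(el[a] * el[b] * kappa[a][b] for a in range(n) for b in range(n))

def components(el, kappa):
    """Components v^b = sum over a of el[a] * kappa[a][b] of the vector field v."""
    n = len(el)
    return [sum(el[a] * kappa[a][b] for a in range(n)) for b in range(n)]

# Hypothetical values of the Euler-Lagrange derivatives at one jet-bundle point.
el_values = [3, -5, 7]
generic = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
kappa = skew_part(generic)

print(contract(el_values, kappa))     # vanishes because kappa is skew-symmetric
print(contract(el_values, generic))   # a generic matrix gives no such cancellation
print(components([0, 0, 0], kappa))   # on-shell (all el values zero) v vanishes
```

The first printed value illustrates why the Lie derivative of the Lagrangian reduces to a total derivative, and the last one illustrates why such symmetries vanish on-shell.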

\,

infinitesimal gauge symmetries

Prop. 77 says that the problem is to identify the presence of spacetime-compactly supported infinitesimal symmetries that are on-shell non-trivial.

One way they may be identified is if infinitesimal symmetries appear in linearly parameterized collections, where the parameter itself is an arbitrary spacetime-dependent section of some fiber bundle (hence is itself like a field history), because then choosing the parameter to have compact support yields an infinitesimal gauge symmetry. In this case we speak of a gauge parameter (def. 111 below). It turns out that in most examples of Lagrangian field theories of interest, the infinitesimal gauge symmetries all come from gauge parameters this way, and often “gauge symmetry” is understood by default to refer to this case. Therefore we now consider this case in detail.

Definition

(infinitesimal gauge symmetries)

Let (E,\mathbf{L}) be a Lagrangian field theory (def. 60).

Then a collection of infinitesimal gauge symmetries of (E,\mathbf{L}) is

  1. a vector bundle \mathcal{G} \overset{gb}{\longrightarrow} \Sigma over spacetime \Sigma of positive rank, to be called a gauge parameter bundle;

  2. a bundle morphism (def. 4) R from the jet bundle of the fiber product \mathcal{G} \times_\Sigma E with the field bundle (def. 54) to the vertical tangent bundle of E (def. 6):

    \array{ J^\infty_\Sigma( \mathcal{G} \times_\Sigma E ) && \overset{R}{\longrightarrow} && T_\Sigma E & \overset{i}{\hookrightarrow} & T_\Sigma (\mathcal{G} \times_\Sigma E) \\ & \searrow && \swarrow \\ && E }

such that

  1. R is linear in the first argument (in the gauge parameter);

  2. i \circ R is an evolutionary vector field on \mathcal{G} \times_\Sigma E (def. 64);

  3. R is an infinitesimal symmetry of the Lagrangian (def. 66) in the second argument.

We may express this equivalently in components in the case that the field bundle E is a trivial vector bundle with field fiber coordinates (\phi^a) (example 9) and also \mathcal{G} happens to be a trivial vector bundle

\mathcal{G} = \Sigma \times \mathfrak{g}

where \mathfrak{g} is a vector space with coordinate functions \{c^\alpha\}.

Then R may be expanded in the form

(137) R \;=\; \left( c^\alpha R^a_\alpha + c^\alpha_{,\mu} R^{a \mu}_\alpha + c^\alpha_{,\mu_1 \mu_2} R^{a \mu_1 \mu_2}_\alpha + \cdots \right) \partial_{\phi^a} \,,

where the components

R^{a \mu_1 \cdots \mu_k}_\alpha = R^{a \mu_1 \cdots \mu_k}_\alpha\left( (\phi^b), (\phi^b_{,\mu}), \cdots \right) \;\in\; \Omega^{0,0}_\Sigma(E) = C^\infty(J^\infty_\Sigma(E))

are smooth functions on the jet bundle of E, locally of finite order (prop. 19), and such that the Lie derivative of the Lagrangian density along R(e) is a total spacetime derivative, which by Noether's theorem I (prop. 30) means in components that

\left( c^\alpha R^a_\alpha + c^\alpha_{,\mu} R^{a \mu}_\alpha + c^\alpha_{,\mu_1 \mu_2} R^{a \mu_1 \mu_2}_\alpha + \cdots \right) \frac{\delta_{EL} \mathbf{L}}{\delta \phi^a} \;=\; \frac{d}{d x^\mu} J^\mu_{R(e)} \,.

The point is that infinitesimal gauge symmetries in particular yield spacetime-compactly supported infinitesimal symmetries of the Lagrangian:

Remark

(infinitesimal gauge symmetries yield spacetime-compactly supported infinitesimal symmetries of the Lagrangian)

Let (E,\mathbf{L}) be a Lagrangian field theory (def. 60) and \mathcal{G} \overset{gb}{\to} \Sigma a bundle of gauge parameters for it (def. 111) with gauge parametrization

J^\infty_\Sigma(\mathcal{G} \times_\Sigma E) \overset{R}{\longrightarrow} T_\Sigma E \,.

Then for every smooth section \epsilon \in \Gamma_\Sigma(\mathcal{G}) of the gauge parameter bundle (def. 5) there is an induced infinitesimal symmetry of the Lagrangian (def. 66) given by the composition of R with the jet prolongation of \epsilon (def. 55)

R(\epsilon) \;\colon\; J^\infty_\Sigma(E) = \Sigma \times_\Sigma J^\infty_\Sigma(E) \overset{(j^\infty_\Sigma(\epsilon),id)}{\longrightarrow} J^\infty_\Sigma(\mathcal{G} \times_\Sigma E) \overset{R}{\longrightarrow} T_\Sigma E \,.

In the components (137) this means that

R(\epsilon) \;=\; \left( \epsilon^\alpha R^a_\alpha + \frac{\partial \epsilon^\alpha}{\partial x^\mu} R^{a \mu}_\alpha + \frac{\partial^2 \epsilon^\alpha}{\partial x^{\mu_1} \partial x^{\mu_2}} R^{a \mu_1 \mu_2}_\alpha + \cdots \right) \partial_{\phi^a} \,,

where now \frac{\partial^k \epsilon^\alpha}{\partial x^{\mu_1} \cdots \partial x^{\mu_k}}((x^\mu)) are the actual spacetime partial derivatives of the gauge parameter section.

In particular, since \mathcal{G} \overset{gb}{\to} \Sigma is assumed to be a vector bundle, there always exist gauge parameter sections \epsilon that have compact support (bump functions). For such compactly supported \epsilon the infinitesimal symmetry R(\epsilon) is spacetime-compactly supported as in prop. 77.

The following is a way to identify infinitesimal gauge symmetries:

Proposition

(Noether's theorem II – Noether identities)

Let (E,\mathbf{L}) be a Lagrangian field theory (def. 60) and let \mathcal{G} \overset{gb}{\to} \Sigma be a vector bundle.

Then a bundle morphism of the form

J^\infty_\Sigma(\mathcal{G} \times_\Sigma E) \overset{R}{\longrightarrow} T_\Sigma E

is a collection of infinitesimal gauge symmetries (def. 111) with local components (137)

R \;=\; \left( c^\alpha R^a_\alpha + c^\alpha_{,\mu} R^{a \mu}_\alpha + c^\alpha_{,\mu_1 \mu_2} R^{a \mu_1 \mu_2}_\alpha + \cdots \right) \partial_{\phi^a}

precisely if the Euler-Lagrange form \delta_{EL}\mathbf{L} (prop. 22) satisfies the following condition:

\left( R^{a}_\alpha \frac{\delta_{EL}\mathbf{L}}{\delta \phi^a} - R^{a \mu}_\alpha \frac{d}{d x^\mu} \frac{\delta_{EL}\mathbf{L}}{\delta \phi^a} + R^{a \mu_1 \mu_2}_\alpha \frac{d^2}{d x^{\mu_1} d x^{\mu_2}} \frac{\delta_{EL}\mathbf{L}}{\delta \phi^a} - \cdots \right) \;=\; 0 \,.

These relations are called the Noether identities of the Euler-Lagrange equations of motion (def. 61).

Proof

By Noether's theorem I, R is an infinitesimal symmetry of the Lagrangian precisely if the contraction (def. 13) of R with the Euler-Lagrange form (prop. 22) is horizontally exact:

\iota_{R} \delta_{EL}\mathbf{L} = d J_{\hat R} \,.

From (137) this means that

\begin{aligned}
d J_{\hat R} & = \iota_{R} \delta_{EL} \mathbf{L} \\
& = \underset{k \in \mathbb{N}}{\sum} c^\alpha_{,\mu_1 \cdots \mu_k} R^{a \mu_1 \cdots \mu_k}_\alpha \frac{\delta_{EL} \mathbf{L}}{\delta \phi^a} \\
& = \underset{A}{ \underbrace{ c^\alpha \underset{k \in \mathbb{N}}{\sum} (-1)^k R^{a \mu_1 \cdots \mu_k}_\alpha \frac{d^k}{d x^{\mu_1} \cdots d x^{\mu_k}} \frac{\delta_{EL} \mathbf{L}}{\delta \phi^a} } } + d (...) \,,
\end{aligned}

where in the last step we used jet-level integration by parts to move the total spacetime derivatives off of c^\alpha, thereby picking up some horizontally exact correction term, as shown.

This means that the term A over the brace is horizontally exact:

(138) c^\alpha \underset{k \in \mathbb{N}}{\sum} (-1)^k R^{a \mu_1 \cdots \mu_k}_\alpha \frac{d^k}{d x^{\mu_1} \cdots d x^{\mu_k}} \frac{\delta_{EL} \mathbf{L}}{\delta \phi^a} \;=\; d(...)

But now the term on the left is independent of the jet coordinates c^\alpha_{,\mu_1 \cdots \mu_k} of positive order k \geq 1, while the horizontal derivative increases the dependency on the jet order by one. Therefore the term on the left is horizontally exact precisely if it vanishes, which is the case precisely if the coefficients of c^\alpha vanish, which is the statement of the Noether identities.

Alternatively we may reach this conclusion from (138) by applying to both sides of (138) the Euler-Lagrange derivative (47) with respect to c^\alpha. On the left this yields again the coefficients of c^\alpha, while by the argument from example 50 it makes the right hand side vanish.
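As a worked instance of the integration by parts step in this proof, consider the first-order term alone and assume, for simplicity, that the coefficients R^{a \mu}_\alpha are constant (an assumption made only for this illustration; it holds for the electromagnetic field). Since the total spacetime derivative acts on the jet coordinates by \frac{d}{d x^\mu} c^\alpha = c^\alpha_{,\mu}, the product rule gives:

```latex
\begin{aligned}
  c^\alpha_{,\mu} R^{a \mu}_\alpha \frac{\delta_{EL} \mathbf{L}}{\delta \phi^a}
  & =
  \frac{d}{d x^\mu}
  \left(
    c^\alpha R^{a \mu}_\alpha \frac{\delta_{EL} \mathbf{L}}{\delta \phi^a}
  \right)
  \;-\;
  c^\alpha R^{a \mu}_\alpha \frac{d}{d x^\mu} \frac{\delta_{EL} \mathbf{L}}{\delta \phi^a}
\end{aligned}
```

The first summand is the horizontally exact correction term, and the second is the k = 1 contribution to the term A, with its sign (-1)^1.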

\,

Example

(infinitesimal gauge symmetry of electromagnetic field)

Consider the Lagrangian field theory (E,\mathbf{L}) of free electromagnetism on Minkowski spacetime \Sigma from example 40. With field coordinates denoted ((x^\mu), (a_\mu)) the Lagrangian density is

\mathbf{L} \;=\; \tfrac{1}{2} f_{\mu \nu} f^{\mu \nu} \, dvol_\Sigma \,,

where f_{\mu \nu} \coloneqq a_{[\nu,\mu]} = \tfrac{1}{2}\left( a_{\nu,\mu} - a_{\mu,\nu} \right) is the universal Faraday tensor from example 36.

Let \mathcal{G} \coloneqq \Sigma \times \mathbb{R} be the trivial line bundle, regarded as a gauge parameter bundle (def. 111) with coordinate functions ((x^\mu), c).

Then a gauge parametrized evolutionary vector field (137) is given by

R \;=\; c_{,\mu} \partial_{a_\mu}

with prolongation (prop. 28)

(139) \widehat R \;=\; c_{,\mu} \partial_{a_\mu} + c_{,\mu \nu} \partial_{a_{\mu,\nu}} + \cdots \,.

This is an infinitesimal gauge symmetry (def. 111), because already the universal Faraday tensor is invariant under this flow:

\begin{aligned} \mathcal{L}_R f_{\mu \nu} &= \tfrac{1}{2} c_{,\mu' \nu'} \partial_{a_{\mu',\nu'}} \left( a_{\nu, \mu} - a_{\mu,\nu} \right) \\ & = \tfrac{1}{2} \left( c_{,\nu\mu} - c_{,\mu \nu} \right) \\ & = 0 \,, \end{aligned}

because partial derivatives commute with each other: c_{,\mu \nu} = c_{,\nu \mu} (29).

Equivalently, the Euler-Lagrange form

\delta_{EL}\mathbf{L} \;=\; \frac{d}{d x^\mu }f^{\mu \nu} \delta a_\nu \, dvol_\Sigma

of the theory (example 46), corresponding to the vacuum Maxwell equations (example 26), satisfies the following Noether identity (prop. 78):

\frac{d}{d x^\mu} \frac{d}{d x^\nu} f^{\mu \nu} = 0 \,,

again due to the fact that partial derivatives commute with each other.

This is the archetypical infinitesimal gauge symmetry that gives gauge theory its name.
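Both statements of this example can be checked on a single Fourier mode. The following minimal sketch assumes a plane-wave ansatz a_\mu(x) = A_\mu e^{i k_\nu x^\nu} and c(x) = C e^{i k_\nu x^\nu} (an assumption made for illustration, not part of the text), so that each derivative \frac{d}{d x^\mu} becomes multiplication by i k_\mu. The gauge transformation \delta a_\mu = c_{,\mu} then shifts the amplitude A_\mu by i k_\mu C. The function names and the diagonal inverse metric \eta^{-1} = diag(-1,1,1,1) used to raise indices are likewise hypothetical choices.

```python
ETA_INV = [-1, 1, 1, 1]  # diagonal of the inverse Minkowski metric (assumed convention)

def faraday(k, a):
    """Amplitude F_{mu nu} = (i/2)(k_mu A_nu - k_nu A_mu) of f = (1/2)(a_{nu,mu} - a_{mu,nu})."""
    return [[0.5j * (k[m] * a[n] - k[n] * a[m]) for n in range(4)] for m in range(4)]

def gauge_transform(k, a, c):
    """Shift A_mu by the amplitude i k_mu C of the gradient c_{,mu}."""
    return [a[m] + 1j * k[m] * c for m in range(4)]

def noether_identity(k, f):
    """Amplitude (i k_mu)(i k_nu) F^{mu nu} of d/dx^mu d/dx^nu f^{mu nu}."""
    return sum((1j * k[m]) * (1j * k[n]) * ETA_INV[m] * ETA_INV[n] * f[m][n]
               for m in range(4) for n in range(4))

# Hypothetical integer sample data, chosen so that all arithmetic is exact.
k = (2, 1, -3, 5)
a = (1, 4, -2, 3)
f = faraday(k, a)

print(faraday(k, gauge_transform(k, a, 7)) == f)  # field strength is gauge invariant
print(noether_identity(k, f) == 0)                # symmetric k k against skew F
```

In both checks the cancellation comes from contracting the symmetric combination k_\mu k_\nu with a skew-symmetric expression, which is the momentum-space form of "partial derivatives commute with each other".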

More generally:

Example

(infinitesimal gauge symmetry of Yang-Mills theory)

For \mathfrak{g} a semisimple Lie algebra, consider the Lagrangian field theory of Yang-Mills theory on Minkowski spacetime from example 41, with Lagrangian density

\mathbf{L} \;=\; \tfrac{1}{2} f^\alpha_{\mu \nu} f_\alpha^{\mu \nu}

given by the universal field strength

f^\alpha_{\mu \nu} \;\coloneqq\; \tfrac{1}{2} \left( a^\alpha_{[\nu,\mu]} + \tfrac{1}{2} \gamma^\alpha_{\beta \gamma} a^\beta_{[\mu} a^\gamma_{\nu]} \right) \,.

Let \mathcal{G} \coloneqq \Sigma \times \mathfrak{g} be the trivial vector bundle with fiber \mathfrak{g}, regarded as a gauge parameter bundle (def. 111) with coordinate functions ((x^\mu), c^\alpha).

Then a gauge parametrized evolutionary vector field (137) is given by

R \;=\; \left( c^\alpha_{,\mu} - \gamma^\alpha_{\beta \gamma} c^\beta a^\gamma_\mu \right) \partial_{a^\alpha_\mu}

with prolongation (prop. 28)

\widehat{R} \;=\; \left( c^\alpha_{,\mu} - \gamma^\alpha_{\beta \gamma} c^\beta a^\gamma_\mu \right) \partial_{a^\alpha_\mu} \;+\; \left( c^\alpha_{,\mu \nu} - \gamma^\alpha_{\beta \gamma} \left( c^\beta_{,\nu} a^\gamma_\mu + c^\beta a^\gamma_{\mu,\nu} \right) \right) \partial_{a^\alpha_{\mu,\nu}} \;+\; \cdots \,.

We compute the derivative of the Lagrangian function along this vector field:

R^12f μν αf α μν =(Rf μν α)f α μν =(R(a ν,μ+12γ βγ αa μ βa ν γ))f α μν =(c ,νμ αγ βγ α(c ,μ βa ν γ+c βa ν,μ γ)+γ βγ α(c ,μ βγ βγ βc βa μ γ)a ν γ)f α μν =γ βγ αc β(a ν,μ γ+γ βγ γa μ βa ν γ)=2f μν γf α μν =2γ αβγc αf μν βf γμν =0. \begin{aligned} \widehat{R} \tfrac{1}{2} f^\alpha_{\mu \nu} f_\alpha^{\mu \nu} & = \left( R f^\alpha_{\mu \nu} \right) f_\alpha^{\mu \nu} \\ & = \left( R \left( a_{\nu,\mu} + \tfrac{1}{2}\gamma^\alpha_{\beta \gamma} a^\beta_{\mu} a^\gamma_{\nu} \right) \right) f_\alpha^{\mu \nu} \\ & = \left( c^\alpha_{,\nu \mu} - \gamma^\alpha_{\beta \gamma} \left( c^\beta_{,\mu} a^\gamma_\nu + c^\beta a^\gamma_{\nu,\mu} \right) + \gamma^\alpha_{\beta \gamma} \left( c^\beta_{,\mu} - \gamma^\beta_{\beta' \gamma'} c^{\beta'} a^{\gamma'}_\mu \right) a^\gamma_{\nu} \right) f_\alpha^{\mu \nu} \\ & = - \gamma^{\alpha}_{\beta \gamma} c^\beta \underset{ = 2 f^\gamma_{\mu \nu} }{ \underbrace{ \left( a^\gamma_{\nu,\mu} + \gamma^\gamma_{\beta' \gamma'} a^{\beta'}_\mu a^{\gamma'}_\nu \right) } } f_\alpha{}^{\mu \nu} \\ &= 2 \gamma_{\alpha \beta \gamma} c^\alpha f^\beta_{\mu \nu} f^{\gamma \mu \nu} \\ & = 0 \,. \end{aligned}

Here in the third step we used that $c^\alpha_{,\nu \mu} = + c^\alpha_{,\mu \nu}$ (29), so that its contraction with the skew-symmetric $f_\alpha^{\mu\nu}$ vanishes, and in the last step we used that for a semisimple Lie algebra $\gamma_{\alpha \beta \gamma} \coloneqq k_{\alpha \alpha'} \gamma^{\alpha'}{}_{\beta \gamma}$ is totally skew symmetric.
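The last step can be checked by hand in a small case. The following is a minimal sketch, assuming the example $\mathfrak{g} = \mathfrak{so}(3)$ with structure constants $\gamma^\alpha{}_{\beta\gamma} = \epsilon_{\alpha\beta\gamma}$ and assuming that $k$ is the Killing form $k_{\alpha\beta} = \gamma^\delta{}_{\alpha\epsilon}\gamma^\epsilon{}_{\beta\delta}$ (neither choice is fixed by the text above). All names in the code (`eps`, `gamma_up`, `killing`, `gamma_low`, `totally_skew`, `contraction`) are introduced here for illustration only.

```python
from itertools import product

N = 3  # dimension of so(3), the example Lie algebra assumed here


def eps(i, j, k):
    """Levi-Civita symbol on indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


# gamma_up[a][b][c] = gamma^a_{bc}, meaning [t_b, t_c] = gamma^a_{bc} t_a
gamma_up = [[[eps(a, b, c) for c in range(N)] for b in range(N)] for a in range(N)]

# Killing form k_{ab} = gamma^d_{ae} gamma^e_{bd} (assumed choice of k)
killing = [[sum(gamma_up[d][a][e] * gamma_up[e][b][d]
                for d in range(N) for e in range(N))
            for b in range(N)] for a in range(N)]

# Lowered structure constants gamma_{abc} = k_{aa'} gamma^{a'}_{bc}
gamma_low = [[[sum(killing[a][ap] * gamma_up[ap][b][c] for ap in range(N))
               for c in range(N)] for b in range(N)] for a in range(N)]


def totally_skew(t):
    """True if t[a][b][c] flips sign under swapping the first two or the last two indices."""
    return all(t[a][b][c] == -t[b][a][c] and t[a][b][c] == -t[a][c][b]
               for a, b, c in product(range(N), repeat=3))


def contraction(c, f):
    """gamma_{abg} c^a f^b_i f^g_i, summed over a, b, g and a multi-index i."""
    return sum(gamma_low[a][b][g] * c[a] * f[b][i] * f[g][i]
               for a, b, g in product(range(N), repeat=3)
               for i in range(len(f[0])))
```

Since $\sum_i f^\beta_i f^\gamma_i$ is symmetric in $\beta, \gamma$ while $\gamma_{\alpha\beta\gamma}$ is skew in these indices, `contraction` returns zero for any integer inputs, which is the mechanism behind the final equality above.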

So the Lagrangian density of Yang-Mills theory is strictly invariant under these infinitesimal gauge symmetries.

\,

Lie algebra action and Lie algebroids

Making the implicit infinitesimal gauge symmetries explicit means to make explicit how they act on the fields. To this end consider the general concept of an action of a Lie algebra by infinitesimal diffeomorphisms:

Definition

(action of Lie algebra by infinitesimal diffeomorphism)

Let $X$ be a smooth manifold or more generally a locally pro-manifold (prop. 19), and let $\mathfrak{g}$ be a Lie algebra.

An action of $\mathfrak{g}$ on $X$ by infinitesimal diffeomorphisms is a homomorphism of Lie algebras

$$ \rho \;\colon\; \mathfrak{g} \longrightarrow ( Vect(X), [-,-] ) $$

to the smooth vector fields on $X$.

Equivalently – to bring out the relation to the gauge parameterized implicit infinitesimal gauge transformations in def. 23 – this is a $\mathfrak{g}$-parameterized section

$$ \array{ \mathfrak{g} \times X && \overset{\rho}{\longrightarrow} && T X \\ & {\mathllap{pr_2}}\searrow && \swarrow_{\mathrlap{p}} \\ && X } $$

of the tangent bundle, such that for all pairs of points $e_1, e_2$ in $\mathfrak{g}$ we have

$$ \left[\rho(e_1,-), \rho(e_2,-)\right] = \rho([e_1,e_2],-) $$

(with the Lie bracket of vector fields on the left).
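As a quick illustration (an example chosen here, not taken from the text), take $X = \mathbb{R}^3$ with coordinates $(x^a)$ and $\mathfrak{g} = \mathfrak{so}(3)$ with basis $(t_\alpha)$ and bracket $[t_\alpha, t_\beta] = \epsilon_{\alpha\beta\gamma} t_\gamma$. With the sign convention assumed below, the rotation vector fields satisfy the homomorphism condition, as one sample bracket shows:

```latex
\begin{aligned}
\rho(t_1) &= x^3 \partial_2 - x^2 \partial_3 \,, \quad
\rho(t_2) = x^1 \partial_3 - x^3 \partial_1 \,, \quad
\rho(t_3) = x^2 \partial_1 - x^1 \partial_2 \\
[\rho(t_1), \rho(t_2)]
  &= \rho(t_1)(x^1)\,\partial_3 - \rho(t_1)(x^3)\,\partial_1
   - \rho(t_2)(x^3)\,\partial_2 + \rho(t_2)(x^2)\,\partial_3 \\
  &= x^2 \partial_1 - x^1 \partial_2
   \;=\; \rho(t_3) \;=\; \rho([t_1, t_2])
\end{aligned}
```

The other two brackets follow by cyclic permutation of the indices.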

Definition

(irreducible closed gauge parameters)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60). Then a collection

$$ J^\infty_\Sigma(\mathcal{G} \times_\Sigma E) \overset{R}{\to} T_\Sigma E $$

of infinitesimal gauge symmetries (def. 111) is called irreducibly closed if it is closed under the Lie bracket of evolutionary vector fields (prop. 29), in that there is a unique morphism

$$ [-,-] \;\colon\; J^\infty_\Sigma(\mathcal{G}) \times_\Sigma J^\infty_\Sigma(\mathcal{G}) \longrightarrow J^\infty_\Sigma(\mathcal{G}) $$

such that

$$ \left[ R(-) , R(-)\right] \;=\; R([-,-]) \;\colon\; J^\infty_\Sigma(\mathcal{G}) \times_\Sigma J^\infty_\Sigma(\mathcal{G}) \times_\Sigma J^\infty_\Sigma(E) \longrightarrow T_\Sigma(E) \,, $$

where on the left we have the Lie bracket of evolutionary vector fields from prop. 29.

Example

(action of irreducible closed gauge parameterized implicit infinitesimal gauge symmetries on fields)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60), and let $\mathcal{G} \overset{gb}{\to} \Sigma$ be a bundle of irreducible closed gauge parameters for the theory (def. 23) with bundle morphism

$$ \array{ J^\infty_\Sigma( \mathcal{G} \times_\Sigma E ) && \overset{R}{\longrightarrow} && T_\Sigma E \\ & \searrow && \swarrow \\ && E } $$

exhibiting the corresponding parameterized implicit infinitesimal gauge symmetries.

By passing from these evolutionary vector fields $R(e)$ (def. 64) to their prolongations $\widehat{R(e)}$, being actual vector fields on the jet bundle (prop. 28), we obtain a bundle morphism of the form

$$ \array{ J^\infty_\Sigma(\mathcal{G}) \times_\Sigma J^\infty_\Sigma E && \overset{\widehat{R(e)}}{\longrightarrow} && T_\Sigma J^\infty_\Sigma(E) \\ & \searrow && \swarrow \\ && J^\infty_\Sigma(E) } \,. $$

In the case that $\mathcal{G} = \mathfrak{g} \times \Sigma$ is a trivial vector bundle with fiber $\mathfrak{g}$, so is its jet bundle

$$ J^\infty_\Sigma(\mathfrak{g} \times \Sigma) = \mathfrak{g}^\infty \times \Sigma \,, $$

and so in this case the above becomes of the form

$$ \array{ \mathfrak{g}^\infty \times J^\infty_\Sigma E && \overset{\widehat{R(e)}}{\longrightarrow} && T_\Sigma J^\infty_\Sigma(E) \\ & \searrow && \swarrow \\ && J^\infty_\Sigma(E) } \,. $$

By def. 23 and def. 112 this now exhibits an action

$$ \widehat{\rho} \;\colon\; \mathfrak{g}^\infty \longrightarrow \Gamma_{J^\infty_\Sigma(E)}\left( T(J^\infty_\Sigma(E)) \right) $$

of a Lie algebra $\mathfrak{g}^\infty$ on the jet bundle of the field bundle by infinitesimal diffeomorphisms.

We have seen that the presence of non-trivial implicit infinitesimal gauge transformations in a Lagrangian field theory obstructs the existence of the covariant phase space of the theory (prop. 77). But these implicit infinitesimal gauge symmetries become explicit by hard-wiring into the very geometry of the types of fields their equivalence under these symmetries: In physics this is called gauge equivalence.

Mathematically this means to pass to the infinitesimal homotopy quotient of the action of the gauge symmetries on the shell, represented by the action Lie algebroid (def. 115 below). This is called the local reduced phase space of the theory. Such “higher structures” exist in the unification of differential geometry with homotopy theory called higher differential geometry. The (“Chevalley-Eilenberg”-)algebra of functions on this “field bundle with infinitesimal gauge symmetries made explicit” is called the BRST complex. In this cochain complex the formerly implicit infinitesimal gauge symmetries appear explicitly in the guise of field variables of positive (i.e. “higher”) degree in a differential graded-commutative algebra. These are called ghost fields.

\,

Definition

(Lie ∞-algebroid)

Let $X$ be an infinitesimally thickened point and write $C^\infty(X)$ for its algebra of functions. Then a connected Lie ∞-algebroid $\mathfrak{a}$ over $X$ of finite type is

  1. a sequence $(\mathfrak{a}_k)_{k = 1}^\infty$ of free modules of finite rank over $C^\infty(X)$, hence a graded module $\mathfrak{a}_\bullet$ in degrees $k \in \mathbb{N}$, $k \geq 1$;

  2. a differential $d_{CE}$ that makes the graded-commutative algebra $Sym_{C^\infty(X)}(\mathfrak{a}^\ast_\bullet)$ into a cochain differential graded-commutative algebra (hence with $d_{CE}$ of degree +1) over $\mathbb{R}$ (not necessarily over $C^\infty(X)$), to be called the Chevalley-Eilenberg algebra of $\mathfrak{a}$:

    (140) $$ CE(\mathfrak{a}) \;\coloneqq\; \left( Sym_{C^\infty(X)}(\mathfrak{a}^\ast_\bullet) \,,\, d_{CE} \right) \,. $$

If we allow $\mathfrak{a}_\bullet$ to also have terms in non-positive degree, then we speak of a derived Lie algebroid. If $\mathfrak{a}_\bullet$ is only concentrated in negative degrees, we also speak of a derived manifold.

With $C^\infty(X)$ canonically itself regarded as a dgc-algebra, there is a canonical dg-algebra homomorphism

$$ CE(\mathfrak{a}) \longrightarrow C^\infty(X) $$

which is the identity on $C^\infty(X)$ and zero on $\mathfrak{a}^\ast_{\neq 0}$.

Remark

(Lie algebroids as differential graded manifolds)

Definition 114 of derived Lie algebroids is an encoding in higher algebra (homological algebra, in this case) of a situation that is usefully thought of in terms of higher differential geometry.

To see this, recall the magic algebraic properties of ordinary differential geometry (prop. 1)

  1. embedding of smooth manifolds into formal duals of $\mathbb{R}$-algebras;

  2. embedding of smooth vector bundles into formal duals of modules.

Together these imply that we may think of the graded algebra underlying a Chevalley-Eilenberg algebra as being the algebra of functions on a graded manifold

$$ \cdots \times \mathfrak{a}_2 \times \mathfrak{a}_1 \times X \times \mathfrak{a}_{-1} \times \cdots $$

which is infinitesimal in non-vanishing degree.

The “higher” in higher differential geometry refers to the degrees higher than zero. See at Higher Structures for exposition. Specifically if $\mathfrak{a}_\bullet$ has components in negative degrees, these are also called derived manifolds.

Example

(basic examples of Lie algebroids)

Two basic examples of Lie algebroids are:

  1. For $X$ any smooth manifold, setting $\mathfrak{a}_{\neq 0} \coloneqq 0$ and $d_{CE} \coloneqq 0$ makes it a Lie algebroid. We will still just write $X$ for the manifold trivially regarded as a Lie algebroid this way.

  2. For $\mathfrak{g}$ a finite dimensional Lie algebra we obtain a Lie algebroid denoted $\ast/\mathfrak{g}$ or $B \mathfrak{g}$ by taking the base manifold to be the point, taking $\mathfrak{a}$ to be concentrated in degree 1 on $\mathfrak{g}$, and taking the differential to be given by the linear dual of the Lie bracket

    $$ d_{CE} \;\colon\; \mathfrak{g}^\ast \overset{[-,-]^\ast}{\longrightarrow} \mathfrak{g}^\ast \wedge \mathfrak{g}^\ast \,. $$

    If $(t_\alpha)$ is a linear basis for $\mathfrak{g}$ and $(t^\alpha)$ a corresponding dual basis for $\mathfrak{g}^\ast$, then this is given by

    $$ d_{CE} t^\alpha \;=\; -\tfrac{1}{2} C^\alpha{}_{\beta \gamma} t^\beta \wedge t^\gamma \,, $$

    where on the right we have the structure constants of the Lie bracket in this basis:

    $$ [t_\beta, t_\gamma] = t_\alpha C^\alpha{}_{\beta \gamma} \,. $$

    The resulting dgc-algebra

    $$ \left( \wedge^\bullet \mathfrak{g}^\ast, d_{CE} = [-,-]^\ast \right) $$

    is the standard Chevalley-Eilenberg algebra from basic Lie theory, whence the name of the general concept.

The two basic examples in example 95 are unified by the concept of action Lie algebroid (def. 115 below), which is the one of central relevance for the discussion of gauge theory: it underlies the local BRST complex (example 97 below).

Definition

(action Lie algebroid)

Given an infinitesimal action of a Lie algebra $\mathfrak{g}$ on a manifold $X$ (def. 112) the action Lie algebroid $X/\mathfrak{g}$ is the Lie algebroid (def. 114) whose underlying space is $X$; whose $C^\infty(X)$-module is concentrated in degree 1 on the free module $C^\infty(X) \otimes_{\mathbb{R}} \mathfrak{g}$ and whose CE-differential is given

  • on functions $f \in C^\infty(X)$ by the Lie algebra action

    $$ d_{CE} f \coloneqq \rho(-)(f) \in C^\infty(X) \otimes \mathfrak{g}^\ast $$

  • on dual Lie algebra elements $\omega \in \mathfrak{g}^\ast$ by the linear dual of the Lie bracket

    $$ d_{CE} \omega \coloneqq \omega([-,-]) \;\in \; \mathfrak{g}^\ast \wedge \mathfrak{g}^\ast \,. $$

In terms of coordinates this means the following. Assume that $X = \mathbb{R}^n$ is a Cartesian space with coordinates $(\phi^a)$ and let $\{t_\alpha\}$ be a linear basis for $\mathfrak{g}$ with dual basis $(c^\alpha)$ for $\mathfrak{g}^\ast$. Then the CE-differential has components

$$ \begin{aligned} d_{CE} \phi^a & = \rho^{a}_{\alpha} c^\alpha \\ d_{CE} c^\alpha & = -\tfrac{1}{2} \gamma^\alpha{}_{\beta \gamma} \, c^\beta \wedge c^\gamma \end{aligned} $$

where $\rho^a_\alpha$ are the components of the vector fields $\rho(t_\alpha)$ and $\gamma^\alpha{}_{\beta \gamma}$ are the structure constants of the Lie algebra in this basis:

$$ [t_\beta, t_\gamma] = \gamma^\alpha{}_{\beta \gamma} t_\alpha \,. $$

That the differential $d_{CE}$ thus defined indeed squares to 0 is

  • in degree 0 the action property: $\rho([t, t']) = [\rho(t), \rho(t')]$

  • in degree 1 the Jacobi identity.
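The degree-1 condition can be tested mechanically for given structure constants. The following is a small sketch, assuming the example $\mathfrak{so}(3)$ with $\gamma^\alpha{}_{\beta\gamma} = \epsilon_{\alpha\beta\gamma}$; the names `eps`, `SO3`, `jacobiator` and `jacobi_holds` are hypothetical helpers introduced here, not notation from the text.

```python
from itertools import product


def eps(i, j, k):
    """Levi-Civita symbol on indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


# SO3[a][b][c] = gamma^a_{bc}, meaning [t_b, t_c] = gamma^a_{bc} t_a
SO3 = [[[eps(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]


def jacobiator(gamma, a, b, c, d):
    """Component along t_a of [t_b,[t_c,t_d]] + [t_c,[t_d,t_b]] + [t_d,[t_b,t_c]]."""
    n = len(gamma)
    return sum(gamma[a][b][e] * gamma[e][c][d]
               + gamma[a][c][e] * gamma[e][d][b]
               + gamma[a][d][e] * gamma[e][b][c]
               for e in range(n))


def jacobi_holds(gamma):
    """True if the Jacobi identity holds for all basis elements."""
    n = len(gamma)
    return all(jacobiator(gamma, a, b, c, d) == 0
               for a, b, c, d in product(range(n), repeat=4))
```

For `SO3` the check succeeds, whereas a skew bracket table that does not come from a Lie algebra makes some `jacobiator` component non-zero, so that the corresponding $d_{CE}$ would fail to square to zero on the generators $c^\alpha$.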

Example

(horizontal tangent Lie algebroid)

Let $\Sigma$ be a smooth manifold or more generally a locally pro-manifold (prop. 19). Then we write $\Sigma/T\Sigma$ for the Lie algebroid over $\Sigma$ whose Chevalley-Eilenberg algebra is generated over $C^\infty(\Sigma)$ in degree 1 from the module

$$ \mathfrak{a}_1^\ast \coloneqq (\Gamma(T \Sigma))^\ast \simeq \Gamma(T^\ast \Sigma) = \Omega^1(\Sigma) $$

of differential 1-forms and whose Chevalley-Eilenberg differential is the de Rham differential, so that the Chevalley-Eilenberg algebra is the de Rham dg-algebra

$$ CE( \Sigma/T\Sigma ) \coloneqq (\Omega^\bullet(\Sigma), d_{dR}) \,. $$

This is called the tangent Lie algebroid of $\Sigma$. As a graded manifold (via remark 24) this is called the “shifted tangent bundle” $T[1] \Sigma$ of $\Sigma$.

More generally, let $E \overset{fb}{\to} \Sigma$ be a fiber bundle over $\Sigma$. Then there is a Lie algebroid $J^\infty_\Sigma(E)/T\Sigma$ over the jet bundle of $E$ (def. 54) defined by its Chevalley-Eilenberg algebra being the horizontal part of the variational bicomplex (def. 59):

$$ CE\left( J^\infty_\Sigma(E)/T\Sigma \right) \;\coloneqq\; \left(\Omega^{\bullet,0}_\Sigma(E), d\right) \,. $$

The underlying graded manifold of $J^\infty_\Sigma(E)/T\Sigma$ is the fiber product $J^\infty_\Sigma(E)\times_\Sigma T[1]\Sigma$ of the jet bundle of $E$ with the shifted tangent bundle of $\Sigma$.

There is then a canonical homomorphism of Lie algebroids (def. 116)

$$ \array{ J^\infty_\Sigma(E)/T\Sigma \\ \downarrow \\ \Sigma/T\Sigma } $$

\,

Local off-shell BRST complex

Example

(local BRST complex and ghost fields for irreducible closed gauge parameters)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60), and let $\mathcal{G} \overset{gb}{\longrightarrow} \Sigma$ be a bundle of irreducible closed gauge parameters for the theory (def. 23) with bundle morphism

$$ \array{ J^\infty_\Sigma( \mathcal{G} \times_\Sigma E ) && \overset{R}{\longrightarrow} && T_\Sigma E \\ & \searrow && \swarrow \\ && E } \,. $$

Assuming that the gauge parameter bundle is trivial, $\mathcal{G} = \mathfrak{g} \times \Sigma$, then by example 94 this induces an action $\hat R$ of a Lie algebra $\mathfrak{g}^\infty$ on $J^\infty_\Sigma E$ by infinitesimal diffeomorphisms.

The corresponding action Lie algebroid $J^\infty_\Sigma(E)/\mathfrak{g}^\infty$ (def. 115) has as underlying graded manifold (remark 24)

$$ \mathfrak{g}^\infty[1] \times J^\infty_\Sigma(E) \;\simeq\; J^\infty_\Sigma( \mathcal{G}[1] \times_\Sigma E ) $$

the jet bundle of the graded field bundle

$$ E_{BRST} \coloneqq \mathcal{G}[1] \times_\Sigma E $$

which regards the gauge parameters as fields in degree 1. As such these are called ghost fields:

$$ \left\{ \text{ghost field histories} \right\} \;\coloneqq\; \Gamma_\Sigma( \mathcal{G}[1] ) \,. $$

Therefore we write suggestively

$$ E/\mathcal{G} \;\coloneqq\; J^\infty_\Sigma(E)/\mathfrak{g}^\infty $$

for the action Lie algebroid of the gauge parameterized implicit infinitesimal gauge symmetries on the jet bundle of the field bundle.

The Chevalley-Eilenberg differential of the BRST complex is traditionally denoted

$$ s_{BRST} \coloneqq d_{CE} \,. $$

To express this in coordinates, assume that the field bundle $E$ as well as the gauge parameter bundle are trivial vector bundles (example 9), with $(\phi^a)$ the field coordinates on the fiber of $E$, with induced jet coordinates $((x^\mu), (\phi^a), (\phi^a_{\mu}), \cdots)$, and with $(c^\alpha)$ the ghost field coordinates on the fiber of $\mathcal{G}[1]$, with induced jet coordinates $((x^\mu), (c^\alpha), (c^\alpha_\mu), \cdots)$.

Then in terms of the corresponding coordinate expression for the gauge symmetries $R$ (137) the BRST differential is given on the fields by

$$ s_{BRST} \, \phi^a \;=\; \underset{k \in \mathbb{N}}{\sum} R^{a \mu_1 \cdots \mu_k}_{\alpha} c^\alpha_{\mu_1 \cdots \mu_k} $$

and on the ghost fields by

$$ s_{BRST} \, c^\alpha = \tfrac{1}{2}\gamma^\alpha{}_{\beta \gamma} c^\beta c^\gamma \,, $$

and it extends from there, via prop. 28, to jets of fields and ghost fields by (anti-)commutativity with the total spacetime derivative.
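For instance, for the Yang-Mills gauge symmetry $R = \left( c^\alpha_{,\mu} - \gamma^\alpha_{\beta \gamma} c^\beta a^\gamma_\mu \right) \partial_{a^\alpha_\mu}$ considered earlier, with field coordinates $a^\alpha_\mu$, one may read off the coefficients of $c^\beta$ and of its first jet $c^\beta_{,\mu}$. The following is only a sketch of that substitution into the two formulas above, writing $c^\alpha_{,\mu}$ for the jet coordinate $c^\alpha_\mu$:

```latex
\begin{aligned}
s_{BRST} \, a^\alpha_\mu
  &= c^\alpha_{,\mu} - \gamma^\alpha_{\beta \gamma} \, c^\beta a^\gamma_\mu \\
s_{BRST} \, c^\alpha
  &= \tfrac{1}{2} \gamma^\alpha{}_{\beta \gamma} \, c^\beta c^\gamma
\end{aligned}
```

Here the only non-vanishing components are $R^{(\alpha,\mu)}_{\beta} = - \gamma^\alpha_{\beta \gamma} a^\gamma_\mu$ for $k = 0$ and $R^{(\alpha,\mu)\nu}_{\beta} = \delta^\alpha_\beta \delta^\nu_\mu$ for $k = 1$.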

Moreover, since the action of the infinitesimal gauge symmetries is by definition via prolongations (prop. 28) of evolutionary vector fields (def. 64) and hence compatible with the total spacetime derivative (69), this construction descends to the horizontal tangent Lie algebroid $J^\infty_\Sigma(E)/T\Sigma$ (example 96) to yield

$$ E/(\mathcal{G}\times_\Sigma T \Sigma) \;\coloneqq\; \left(J^\infty_\Sigma(E)/T\Sigma\right)/\mathfrak{g}^\infty $$

The Chevalley-Eilenberg differential on $E/(\mathcal{G}\times_\Sigma T \Sigma)$ is

$$ d - s_{BRST} $$

The Chevalley-Eilenberg algebra of functions on this differential graded manifold (140) is called the off-shell local BRST complex (Barnich-Brandt-Henneaux 94).

We may pass from the local BRST complex on the jet bundle to the “global” BRST complex by transgression of variational differential forms (def. 82):

Write $Obs_\Sigma(E \times_\Sigma \mathcal{G}[1])$ for the induced graded algebra of observables (def. 83). For $A \in \Omega^{p+1,\bullet}_\Sigma(E \times_\Sigma \mathcal{G}[1])$ with corresponding local observable $\tau_\Sigma(A) \in LocObs_\Sigma(E \times_\Sigma \mathcal{G}[1])$ its BRST differential is defined by

$$ s_{BRST} \tau_{\Sigma}(A) \;\coloneqq\; \tau_{\Sigma}(s_{BRST} A) $$

and extended from there to $Obs_\Sigma(E \times_\Sigma \mathcal{G}[1])$ as a graded derivation.

Example

(local BRST complex for free electromagnetic field on Minkowski spacetime)

Consider the Lagrangian field theory of free electromagnetism on Minkowski spacetime (example 40) with its gauge parameter bundle as in example 92.

By (139) the action of the BRST differential is the derivation

$$ s_{BRST} \;=\; c_{,\mu} \frac{\partial}{\partial a_\mu} + c_{, \mu \nu} \frac{\partial}{\partial a_{\mu, \nu}} + \cdots \,. $$

In particular the Lagrangian density is BRST-closed

$$ \begin{aligned} s_{BRST} \mathbf{L} & = s_{BRST} f_{\mu \nu} f^{\mu \nu} dvol_\Sigma \\ & = c_{,\mu \nu} f^{\mu \nu} dvol_\Sigma \\ & = 0 \end{aligned} $$

as is the Euler-Lagrange form (due to the symmetry $c_{,\mu \nu} = c_{,\nu \mu}$ and in contrast to the skew-symmetry $f_{\mu \nu} = - f_{\nu \mu}$).
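The vanishing can also be seen one step earlier, on the field strength itself. The following sketch assumes that $f_{\mu\nu}$ is proportional to $a_{\nu,\mu} - a_{\mu,\nu}$, with a normalization constant $\kappa$ introduced here only as a placeholder, since the normalization is not restated in this example:

```latex
\begin{aligned}
s_{BRST} \, a_{\mu,\nu} &= c_{,\mu \nu} \\
s_{BRST} \, f_{\mu \nu}
  &= \kappa \left( s_{BRST} \, a_{\nu,\mu} - s_{BRST} \, a_{\mu,\nu} \right)
   = \kappa \left( c_{,\nu \mu} - c_{,\mu \nu} \right)
   = 0
\end{aligned}
```

So every function of the jet coordinates that depends on them only through $f_{\mu\nu}$ and its derivatives is BRST-closed.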

\,

This concludes our discussion of gauge symmetries as such. In the next chapter we discuss the homotopy quotient of the covariant phase space by the gauge symmetries, called the reduced phase space.

\,

Reduced phase space

We have seen above that the covariant phase space of a Lagrangian field theory is, if it exists, the “covariant transgression” of the shell (prop. 22) equipped with the local Poisson bracket (example 36). The local observables (def. 83) which operationally define the classical field theory (or rather prequantum field theory) are functions on the covariant phase space, and the local Poisson bracket on them operationally defines the corresponding quantum field theory (below). Therefore the existence of the covariant phase space is crucial for the construction of the field theory.

However, we have then seen in prop. 77 that there may be an obstruction to the existence of the covariant phase space, namely the presence of infinitesimal gauge symmetries of the Lagrangian which have been “left implicit”. We have then discussed how to make these infinitesimal gauge symmetries “explicit” by hard-wiring their action into the geometry of the fields by passing to the corresponding infinitesimal homotopy quotient (def. 114) of (the jet bundle of) the field bundle, given by the corresponding action Lie algebroid (def. 115). Its Chevalley-Eilenberg algebra of functions is called the local BRST complex of the theory (example 97).

The corresponding covariant phase space with infinitesimal gauge symmetries made explicit is now correspondingly given by the shell (48) not inside the plain space of fields, but inside this homotopy quotient by the infinitesimal gauge symmetries. This homotopy quotient of the naive phase space by the infinitesimal gauge symmetries is called the reduced phase space. Refined to its local incarnation in the jet bundle we may call this the “derived reduced prolonged shell” (def. 120 below). Its algebra of functions is called the local BV-BRST complex of the theory.

In the next section below we find that, at least in good situations, if all non-trivial implicit infinitesimal gauge symmetries have been made explicit this way by hard-wiring their action into the geometry of the reduced phase space, then the obstruction to the existence of the covariant phase space vanishes. Hence in this case the (perturbative) quantum field theory can exist (discussed further below). This is why we need to pass to the reduced phase space.

In order to exhibit the key structure of the reduced phase space without getting distracted by the local jet bundle geometry, we first discuss now the simple form in which it would appear after transgression (def. 82) if spacetime were compact, so that, by the principle of extremal action (prop. 45), it would be the derived critical locus ($d S = 0$) of a globally defined action functional $S$. This is example 100 below.

This serves as a warmup to the true construction of the derived shell in the action Lie algebroid of the jet bundle, where the action functional is “de-transgressed” to the Lagrangian density, which is invariant under gauge transformations only up to horizontally exact terms. This culminates in example 82 below.

\,

The key to understanding the “derived reduced prolonged shell”, and hence the reduced phase space, as a derived critical locus is first to exhibit the Euler-Lagrange variation of the action functional, or rather of the Lagrangian density, as a section of the analog of a cotangent bundle, but now in the realm of Lie ∞-algebroids (prop. 80 and prop. 81 below). To this end we need to first of all consider homomorphisms of Lie algebroids:

Definition

(homomorphism between Lie algebroids)

Given two derived Lie algebroids $\mathfrak{a}$, $\mathfrak{a}'$ (def. 114), over base spaces $X$ and $Y$, respectively, a homomorphism between them

$$ f \;\colon\; \mathfrak{a} \longrightarrow \mathfrak{a}' $$

is a dg-algebra-homomorphism between their Chevalley-Eilenberg algebras going the other way around

$$ CE(\mathfrak{a}) \longleftarrow CE(\mathfrak{a}') \;\colon\; f^\ast $$

such that this covers an algebra homomorphism on the function algebras (a “non-curved sh-map”)

$$ \array{ CE(\mathfrak{a}) &\overset{f^\ast}{\longleftarrow}& CE(\mathfrak{a}') \\ \downarrow && \downarrow \\ C^\infty(X) &\underset{(f\vert_X)^\ast}{\longleftarrow}& C^\infty(Y) } \,. $$

Example

(gauge invariant functions in terms of Lie algebroids)

Let $X/\mathfrak{g}$ be an action Lie algebroid (def. 115) and regard the real line $\mathbb{R}^1$ as a Lie algebroid by example 95. Then homomorphisms of Lie algebroids (def. 116) of the form

$$ f \;\colon\; X/\mathfrak{g} \longrightarrow \mathbb{R}^1 \,, $$

hence smooth functions on the Lie algebroid, are equivalently smooth functions $f \in C^\infty(X)$ on the underlying manifold that are invariant under the Lie algebra action, in that $\rho(-)(f) = 0$.

Proof

An $\mathbb{R}$-algebra homomorphism

$$ CE( X/\mathfrak{g} ) \longleftarrow C^\infty(\mathbb{R}^1) $$

is fixed by what it does to the canonical coordinate function on $\mathbb{R}^1$, which is taken by $f^\ast$ to $f \in C^\infty(X) \hookrightarrow CE(X/\mathfrak{g})$. For this to be a dg-algebra homomorphism it needs to respect the differentials on both sides. Since the differential on the right is trivial, the condition is that $0 = d_{CE} f = \rho(-)(f)$.
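A minimal illustration, with an example chosen here rather than taken from the text: let $X = \mathbb{R}^2$ with coordinates $(x^1, x^2)$ and let the abelian Lie algebra $\mathfrak{g} = \mathbb{R}$, with basis element $t$ and dual generator $c$, act by the rotation vector field below. Then the squared radius is gauge invariant, while a single coordinate function is not:

```latex
\begin{aligned}
\rho(t) &= x^1 \partial_2 - x^2 \partial_1 \\
d_{CE}\left( (x^1)^2 + (x^2)^2 \right)
  &= \left( x^1 \cdot 2 x^2 - x^2 \cdot 2 x^1 \right) c \;=\; 0 \\
d_{CE}\left( x^1 \right)
  &= - x^2 \, c \;\neq\; 0
\end{aligned}
```

Accordingly only the first function defines a homomorphism $X/\mathfrak{g} \to \mathbb{R}^1$.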

Given a gauge invariant function, hence a function $S \colon X/\mathfrak{g} \to \mathbb{R}$ on a Lie algebroid (example 98), its exterior derivative $d S$ should be a section of the cotangent bundle of the Lie algebroid. Moreover, if all field variations are infinitesimal (as in def. 84) then it should in fact be a section of the infinitesimal neighbourhood (example 27) of the zero section inside the cotangent bundle, the infinitesimal cotangent bundle $T^\ast(X/\mathfrak{g})$ of the Lie algebroid (def. 117 below).

To motivate the definition 117 below of infinitesimal cotangent bundle of a Lie algebroid recall from example 27 that the algebra of functions on the infinitesimal cotangent bundle should be fiberwise the formal power series algebra in the linear functions. But a fiberwise linear function on a cotangent bundle is by definition a vector field. Finally observe that vector fields are equivalently derivations of smooth functions (prop. 1). This leads to the following definition:

Definition

(infinitesimal cotangent Lie algebroid)

Let $\mathfrak{a}$ be a Lie ∞-algebroid (def. 114) over some manifold $X$. Then its infinitesimal cotangent bundle $T^\ast \mathfrak{a}$ is the Lie ∞-algebroid over $X$ whose underlying graded module over $C^\infty(X)$ is the direct sum of the original module with the derivations of the graded algebra underlying $CE(\mathfrak{a})$:

$$ (T^\ast \mathfrak{a})^\ast_\bullet \;\coloneqq\; \mathfrak{a}^\ast_\bullet \oplus Der(CE(\mathfrak{a}))_\bullet $$

with differential on the summand $\mathfrak{a}^\ast$ being the original differential and on $Der(CE(\mathfrak{a}))$ being the commutator with the differential $d_{CE(\mathfrak{a})}$ on $CE(\mathfrak{a})$ (which is itself a graded derivation of degree +1):

$$ d_{CE(T^\ast \mathfrak{a})}\vert_{\mathfrak{a}^\ast} \;\coloneqq\; d_{CE(\mathfrak{a})} $$

$$ d_{CE(T^\ast \mathfrak{a})}\vert_{Der(CE(\mathfrak{a}))} \;\coloneqq\; [d_{CE(\mathfrak{a})},-] \,. $$

There is a canonical homomorphism of Lie algebroids (def. 116)

(141) $$ T^\ast \mathfrak{a} \longrightarrow \mathfrak{a} $$

given dually by the identity on the original generators.

Example

(infinitesimal cotangent bundle of action Lie algebroid)

Let $X/\mathfrak{g}$ be an action Lie algebroid (def. 115) where

  • $X = \mathbb{R}^n$ is a Cartesian space with coordinates $(\phi^a)$;

  • $\mathfrak{g}$ is a Lie algebra with linear basis $(c_\alpha)$ and corresponding structure constants $(\gamma^{\alpha}{}_{\beta \gamma})$;

  • the infinitesimal action is given in components by

    $$ d_{CE} \;\phi^a\; = R^a_\alpha c^\alpha $$

    for smooth functions $R^a_\alpha$ on $X$.

Then the infinitesimal cotangent Lie algebroid $T^\ast (X/\mathfrak{g})$ (def. 117) has as underlying cochain complex the one with generators

$$ \array{ & \left( \frac{\partial}{\partial c^\alpha} \right) & \left( \phi^a \right) , \left( \frac{\partial}{\partial \phi^a} \right) & \left( c^\alpha \right) \\ deg = & -1 & 0 & +1 } \,. $$

The CE-differential on the new derivation generators is given by

(142) $$ \begin{aligned} d_{CE(T^\ast(X/\mathfrak{g}))} \left( \frac{\partial}{\partial c^\alpha} \right) & \coloneqq \left[d_{CE(X/\mathfrak{g})}, \frac{\partial}{\partial c^\alpha} \right] \\ & = R_\alpha^a \frac{\partial}{\partial \phi^a} + \gamma^\beta{}_{\alpha \gamma} c^\gamma \frac{\partial}{\partial c^\beta} \end{aligned} $$

and

(143) $$ \begin{aligned} d_{CE(T^\ast(X/\mathfrak{g}))} \left( \frac{\partial}{\partial \phi^a} \right) & \coloneqq \left[ d_{CE(X/\mathfrak{g})}, \frac{\partial}{\partial \phi^a} \right] \\ & = c^\alpha \frac{\partial R_\alpha^b}{\partial \phi^a} \frac{\partial}{\partial \phi^b} \end{aligned} \,. $$

To ease the notation one abbreviates

(144) $$ \overline{\phi}_a \;\coloneqq\; \frac{\partial}{\partial \phi^a} \qquad \overline{c}_\alpha \;\coloneqq\; \frac{\partial}{\partial c^\alpha} $$

so that the generator content then reads

(145) $$ \array{ & \left( \overline{c}_\alpha \right) & \left( \phi^a \right) , \left( \overline{\phi}_a \right) & \left( c^\alpha \right) \\ deg = & -1 & 0 & +1 } \,. $$

In this notation the full action of the CE-differential is therefore the following:

(146) $$ \array{ & d_{CE(T^\ast(X/\mathfrak{g}))} \\ \phi^a &\mapsto& c^\alpha R^a_\alpha \\ c^\alpha & \mapsto& \tfrac{1}{2} \gamma^\alpha{}_{\beta \gamma} c^\beta c^\gamma \\ \overline{\phi}_a &\mapsto& + c^\alpha \frac{\partial R_\alpha^b}{\partial \phi^a} \overline{\phi}_b \\ \overline{c}_\alpha &\mapsto& R_\alpha^a \overline{\phi}_a + \gamma^\beta{}_{\alpha \gamma} c^\gamma \overline{c}_\beta } $$
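To spell out how the right hand side of (142) arises, here is a worked sketch. It takes $d_{CE} \coloneqq d_{CE(X/\mathfrak{g})}$ with the signs used in (146) on $\phi^a$ and $c^\alpha$, and it assumes the graded commutator convention $[a,b] = a \circ b - (-1)^{|a| |b|} b \circ a$, which is not stated explicitly above. Since $d_{CE}$ and $\partial/\partial c^\alpha$ both have odd degree, their graded commutator is the anticommutator, and it suffices to evaluate it on the generators:

```latex
\begin{aligned}
\left[ d_{CE}, \tfrac{\partial}{\partial c^\alpha} \right]
  &= d_{CE} \circ \tfrac{\partial}{\partial c^\alpha}
   + \tfrac{\partial}{\partial c^\alpha} \circ d_{CE} \\
\left[ d_{CE}, \tfrac{\partial}{\partial c^\alpha} \right](\phi^a)
  &= d_{CE}(0) + \tfrac{\partial}{\partial c^\alpha}\left( c^\beta R^a_\beta \right)
   \;=\; R^a_\alpha \\
\left[ d_{CE}, \tfrac{\partial}{\partial c^\alpha} \right](c^\beta)
  &= d_{CE}(\delta^\beta_\alpha)
   + \tfrac{\partial}{\partial c^\alpha}
     \left( \tfrac{1}{2} \gamma^\beta{}_{\delta \gamma} c^\delta c^\gamma \right)
   \;=\; \tfrac{1}{2} \left( \gamma^\beta{}_{\alpha \gamma} c^\gamma
        - \gamma^\beta{}_{\delta \alpha} c^\delta \right)
   \;=\; \gamma^\beta{}_{\alpha \gamma} c^\gamma
\end{aligned}
```

These two values are the coefficients of $\partial/\partial \phi^a$ and of $\partial/\partial c^\beta$ in (142); the last equality uses the skew-symmetry of the structure constants in their lower indices.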

Proposition

(exterior differential of gauge invariant function is section of infinitesimal cotangent bundle)

For $\mathfrak{a}$ a Lie ∞-algebroid (def. 114) over some $X$, and $S \;\colon\; \mathfrak{a} \longrightarrow \mathbb{R}$ a gauge invariant smooth function on it (example 98), there is an induced section $d S$ of the bundle projection (141) of the infinitesimal cotangent Lie algebroid (def. 117):

$$ \array{ && T^\ast \mathfrak{a} \\ & {}^{\mathllap{d S}}\nearrow & \downarrow \\ \mathfrak{a} &=& \mathfrak{a} } \,, $$

given dually

$$ (d S)^\ast \;\colon\; CE(T^\ast \mathfrak{a}) \longrightarrow CE(\mathfrak{a}) $$

by the map which sends

  1. the generators in $\mathfrak{a}^\ast$ to themselves;

  2. a vector field $v$ on $X$, regarded as a degree-0 derivation, to $d S(v) = v(S) \in C^\infty(X)$;

  3. all other derivations to zero.

Proof

We discuss the proof in the special case of example 99. The general case is directly analogous.

We need to check that $(d S)^\ast$ respects the CE-differentials.

On the original generators in $\mathfrak{a}^\ast$ this is immediate, since on these the CE-differentials on both sides are by definition the same.

On the derivation $\frac{\partial}{\partial \phi^a}$ we find from (143)

$$ \array{ \left\{ \frac{\partial S}{\partial \phi^a} \right\} &\overset{(d S)^\ast}{\longleftarrow}& \left\{ \frac{\partial}{\partial \phi^a} \right\} \\ {}^{\mathllap{d_{CE(\mathfrak{a})}}}\downarrow && \downarrow^{\mathrlap{d_{CE(T^\ast \mathfrak{a})}}} \\ \left\{ c^\alpha \frac{\partial R_\alpha^b}{\partial \phi^a} \frac{\partial S}{\partial \phi^b} \right\} &\underset{(d S)^\ast}{\longleftarrow}& \left\{ c^\alpha \frac{\partial R_\alpha^b}{\partial \phi^a} \frac{\partial}{\partial \phi^b} \right\} } $$

and on the derivation $\frac{\partial}{\partial c^\alpha}$ we find from (142) and using the gauge invariance of $S$

$$ \array{ \left\{ 0 \right\} &\overset{(d S)^\ast}{\longleftarrow}& \left\{ \frac{\partial}{\partial c^\alpha} \right\} \\ \downarrow && \downarrow \\ \left\{ 0 = R_\alpha^a \frac{\partial S}{\partial \phi^a} \right\} &\underset{(d S)^\ast}{\longleftarrow}& \left\{ R_\alpha^a \frac{\partial}{\partial \phi^a} + \gamma^\beta{}_{\alpha \gamma} c^\gamma \frac{\partial}{\partial c^\beta} \right\} } \,. $$

Definition

(derived critical locus of gauge invariant function on Lie ∞-algebroid)

Let 𝔞\mathfrak{a} be a Lie ∞-algebroid (def. 114) over some XX, let

S:𝔞 S \;\colon\; \mathfrak{a} \longrightarrow \mathbb{R}

be a gauge invariant function (example 98) and consider the section of its infinitesimal cotangent bundle T *𝔞T^\ast \mathfrak{a} (def. 99) corresponding to its exterior derivative via prop. 80.

𝔞 dS T *𝔞 𝔞 \array{ \mathfrak{a} && \overset{d S}{\longrightarrow} && T^\ast \mathfrak{a} \\ & \searrow && \swarrow \\ && \mathfrak{a} }

Then the derived critical locus of SS is the derived Lie algebroid (def. 114) to be denoted 𝔞 dS0\mathfrak{a}_{d S \simeq 0} which is the homotopy pullback of the section dSd S along the zero section:

𝔞 dS0 𝔞 (pb) 0 𝔞 dS T *𝔞. \array{ \mathfrak{a}_{d S \simeq 0} &\longrightarrow& \mathfrak{a} \\ \downarrow &(pb)& \downarrow^{\mathrlap{0}} \\ \mathfrak{a} &\underset{d S}{\longrightarrow}& T^\ast \mathfrak{a} } \,.

This means equivalently (details are at derived critical locus) that the Chevalley-Eilenberg algebra of $\mathfrak{a}_{d S \simeq 0}$ is like that of the infinitesimal cotangent Lie algebroid $T^\ast \mathfrak{a}$ (def. 117) except for two changes:

  1. all derivations are shifted down in degree by one;

  2. the CE-differential on the derivations coming from vector fields $v$ on $X$ is that of the infinitesimal cotangent Lie algebroid $T^\ast \mathfrak{a}$ plus $d S(v) = v(S)$.
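
As a sanity check of these two changes, here is a minimal worked sketch for the simplest case, under assumptions not made in the text: $\mathfrak{a} = X = \mathbb{R}$ is an ordinary manifold without gauge symmetry, with coordinate $\phi$, and $S = \tfrac{1}{2}\phi^2$ is a hypothetical choice of function. The symbol $\overline{\phi}$ denotes the degree-shifted derivation $\frac{\partial}{\partial \phi}$.

```latex
% Generators and differential of CE(\mathfrak{a}_{d S \simeq 0})
% for X = \mathbb{R} and S = \tfrac{1}{2}\phi^2 (illustrative assumptions):
\begin{aligned}
  \deg \phi & = 0 \,, & \deg \overline{\phi} & = -1 \,,
  \\
  d \phi & = 0 \,, & d \overline{\phi} & = \frac{\partial S}{\partial \phi} = \phi \,.
\end{aligned}
% In degree 0 the cohomology is the quotient by the ideal generated by \phi;
% in degree -1 the map f \overline{\phi} \mapsto f \phi has vanishing kernel:
H^0 \simeq C^\infty(\mathbb{R})/(\phi) \simeq \mathbb{R} \,,
\qquad
H^{-1} = 0 \,.
```

In this toy case the complex resolves the algebra of functions on the single critical point $\phi = 0$, which is the ordinary critical locus of $S$.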

Example

(archetype of the BV-BRST complex)

Consider a gauge invariant function $S \colon X/\mathfrak{g} \to \mathbb{R}$ (def. 98) on an action Lie algebroid (def. 115) for the case that the underlying manifold is a Cartesian space with global coordinates as in example 99. Then the generators of the derived critical locus $(X/\mathfrak{g})_{d S \simeq 0}$ (def. 118) are as in (145), except for the degree shift:

$$ \array{ & \left( \overline{c}_\alpha \right) & \left( \overline{\phi}_a \right) & \left( \phi^a \right) & \left( c^\alpha \right) \\ deg = & -2 & -1 & 0 & +1 } $$

and the CE-differential is given by

$$ \array{ & d_{CE(\mathfrak{a}_{d S \simeq 0})} \\ \phi^a &\mapsto& c^\alpha R^a_\alpha \\ c^\alpha & \mapsto& \tfrac{1}{2} \gamma^\alpha{}_{\beta \gamma} c^\beta c^\gamma \\ \overline{\phi}_a &\mapsto& \underset{ new }{ \underbrace{ \frac{\partial S}{\partial \phi^a} }} + c^\alpha \frac{\partial R_\alpha^b}{\partial \phi^a} \overline{\phi}_b \\ \overline{c}_\alpha &\mapsto& R_\alpha^a \overline{\phi}_a + \gamma^\beta{}_{\alpha \gamma} c^\gamma \overline{c}_\beta } $$

which is like the differential (146) of the cotangent Lie algebroid from example 99, except for the degree-shift by -1 of the derivation generators and except for the crucial new term indicated by the underbrace.
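
To make the pattern concrete, here is a hedged toy instance that is not taken from the text: $X = \mathbb{R}^2$ with coordinates $(\phi^1, \phi^2)$, acted on by rotations through the one-dimensional Lie algebra $\mathfrak{g} = \mathfrak{so}(2)$. There is then a single ghost $c$, the structure constants vanish, and $R^1 = -\phi^2$, $R^2 = \phi^1$. The function $S = \tfrac{1}{2}\left( (\phi^1)^2 + (\phi^2)^2 \right)$ is rotation invariant. Instantiating the formulas above gives:

```latex
\begin{aligned}
  d \phi^1 & = - c \, \phi^2 \,,
  \qquad
  d \phi^2 = c \, \phi^1 \,,
  \qquad
  d c = 0 \,,
  \\
  d \overline{c} & = R^a \overline{\phi}_a
    = - \phi^2 \, \overline{\phi}_1 + \phi^1 \, \overline{\phi}_2 \,.
\end{aligned}
% The part of d(d \overline{c}) that contains neither ghosts nor antifields
% comes from replacing \overline{\phi}_a by the new term \partial S / \partial \phi^a:
R^a \frac{\partial S}{\partial \phi^a}
  = - \phi^2 \phi^1 + \phi^1 \phi^2
  = 0 \,.
```

This vanishing is the gauge invariance of $S$, and it is what makes the new term compatible with the differential on $\overline{c}_\alpha$. The remaining terms of $d \circ d$ depend on the sign conventions for graded derivations and are not checked in this sketch.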

If we think of the function $S$ as being the action functional (example 66) of a Lagrangian field theory $(E,\mathbf{L})$ (def. 60) over a compact spacetime $\Sigma$, with $X$ the space of field histories (or rather an infinitesimal neighbourhood therein), hence with $\mathfrak{g}$ a Lie algebra of gauge symmetries acting on the field histories, then the Chevalley-Eilenberg algebra $CE((X/\mathfrak{g})_{d S \simeq 0})$ of the derived critical locus of $S$ is called the BV-BRST complex of the theory.

In applications of interest, the spacetime $\Sigma$ is not compact. In that case one may still appeal to a construction on the space of field histories as in example 100 by considering the action functional for all adiabatically switched Lagrangians $b \mathbf{L}$, with $b \in C_{cp}^\infty(\Sigma)$. This approach is taken in (Fredenhagen-Rejzner 11a).

Here we instead consider now the “local lift” or “de-transgression” of the above construction from the space of field histories to the jet bundle of the field bundle of the theory, refining the BV-BRST complex to the local BV-BRST complex, corresponding to the local BRST complex from example 97.

This requires a slight refinement of the construction that leads to example 100: In contrast to the action functional $S = \tau_\Sigma(\mathbf{L})$, the Lagrangian density $\mathbf{L}$ is in general not strictly invariant under implicit infinitesimal gauge transformations; rather it may change by a horizontally exact term (by the very definition of implicit infinitesimal gauge symmetries). The same is then true for its Euler-Lagrange variational derivative $\delta_{EL} \mathbf{L}$. This means that $\delta_{EL} \mathbf{L}$ is not a section of the infinitesimal cotangent bundle (def. 117) of the gauge action Lie algebroid on the jet bundle, but of a local version of it, which is twisted by horizontally exact terms.

The following definition 119 is the local refinement of def. 117:

Definition

(local infinitesimal cotangent Lie algebroid)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) over some spacetime $\Sigma$, and let $\mathcal{G} \overset{gb}{\to} \Sigma$ be a bundle of closed irreducible gauge parameters (def. 23), inducing via example 97 the Lie algebroid

$$ E / ( \mathcal{G} \times_\Sigma T \Sigma ) \;=\; \left( J^\infty_\Sigma( E \times_\Sigma (\mathcal{G}[1]) ) , s_{BRST} \right) $$

whose Chevalley-Eilenberg algebra is the local BRST complex of the field theory.

Consider the case that both the field bundle $E \overset{fb}{\to} \Sigma$ (def. 34) as well as the gauge parameter bundle $\mathcal{G} \overset{gb}{\to} \Sigma$ are trivial vector bundles (example 9) over Minkowski spacetime with field coordinates $(\phi^a)$ and gauge parameter coordinates $(\epsilon^\alpha)$.

Then the vertical infinitesimal cotangent Lie algebroid (def. 117) has coordinates as in (145) as well as all the corresponding jets and including also the horizontal differentials:

$$ \array{ & \left( \overline{c}_{\alpha,\mu_1 \cdots \mu_k} \right) & \left( \phi^a_{,\mu_1 \cdots \mu_k} \right) , \left( \overline{\phi}_{a,\mu_1 \cdots \mu_k} \right) & \left( c^\alpha_{,\mu_1 \cdots \mu_k} \right), \left( d x^\mu \right) \\ deg = & -1 & 0 & +1 } \,. $$

Observe that in terms of these coordinates the ordinary commutator of graded derivations has the following succinct expression:

(…)

Now consider the modification of this formula to the formula

(147) $$ \begin{aligned} \left\{ f dvol_\Sigma, g \right\} & \coloneqq \phantom{+ (-1)^{\vert f \vert + (p+1)}} \left( \frac{d^k}{d x^{\mu_1} \cdots d x^{\mu_k}} \left( \frac{\delta_{EL} f}{\delta \phi^a} \right) \right) \left( \frac{\partial g}{\partial \overline{\phi}_{a,\mu_1 \cdots \mu_k}} \right) - \left( \frac{d^k}{d x^{\mu_1} \cdots d x^{\mu_k}} \left( \frac{\delta_{EL} f}{\delta \overline{\phi}_a} \right) \right) \left( \frac{\partial g}{\partial \phi^a_{,\mu_1 \cdots \mu_k}} \right) \\ & \phantom{=} + (-1)^{\vert f \vert + (p+1)} \left( \left( \frac{d^k}{d x^{\mu_1} \cdots d x^{\mu_k}} \left( \frac{\delta_{EL} f}{\delta c^\alpha} \right) \right) \left( \frac{\partial g}{\partial \overline{c}_{\alpha,\mu_1 \cdots \mu_k}} \right) - \left( \frac{d^k}{d x^{\mu_1} \cdots d x^{\mu_k}} \left( \frac{\delta_{EL} f}{\delta \overline{c}_\alpha} \right) \right) \left( \frac{\partial g}{\partial c^\alpha_{,\mu_1 \cdots \mu_k}} \right) \right) \end{aligned} $$

where $\delta_{EL}$ denotes the Euler-Lagrange variational derivative.

We define the CE-differential on functions on $T^\ast_\Sigma( E/(\mathcal{G} \times_\Sigma T \Sigma) )$ to be

$$ d_{CE(T^\ast_\Sigma( E/(\mathcal{G} \times_\Sigma T \Sigma) ))} \;\coloneqq\; \left\{ s_{BRST} dvol_\Sigma, - \right\} $$

This defines an $L_\infty$-algebroid to be denoted

$$ T^\ast_\Sigma( E/(\mathcal{G} \times_\Sigma T \Sigma) ) \,. $$

The local refinement of prop. 80 is now this:

Proposition

(Euler-Lagrange form is section of local cotangent bundle of jet bundle gauge-action Lie algebroid)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) over some spacetime $\Sigma$, and let $\mathcal{G} \overset{gb}{\to} \Sigma$ be a bundle of closed irreducible gauge parameters (def. 23), inducing via example 97 the Lie algebroid $E / ( \mathcal{G} \times_\Sigma T \Sigma )$ and via def. 119 its local cotangent Lie ∞-algebroid $T^\ast_\Sigma(E / ( \mathcal{G} \times_\Sigma T \Sigma ))$.

Then the Euler-Lagrange variational derivative (prop. 22) constitutes a section of the local cotangent Lie ∞-algebroid (def. 119)

$$ \array{ && T^\ast_\Sigma\left( E/(\mathcal{G} \times_\Sigma T \Sigma) \right) \\ & {}^{\mathllap{ \delta_{EL} \mathbf{L} }}\nearrow & \downarrow \\ E/(\mathcal{G} \times_\Sigma T \Sigma) &=& E/(\mathcal{G} \times_\Sigma T \Sigma) } $$

given dually

$$ CE(E/(\mathcal{G} \times_\Sigma T\Sigma)) \overset{(\delta_{EL}\mathbf{L})^\ast}{\longleftarrow} CE(T^\ast_\Sigma(E/(\mathcal{G}\times_\Sigma T \Sigma))) $$

by

$$ \array{ \left\{ \phi^a_{,\mu_1 \cdots \mu_k} \right\} &\longleftarrow& \left\{ \phi^a_{,\mu_1 \cdots \mu_k} \right\} \\ \left\{ c^\alpha_{,\mu_1 \cdots \mu_k} \right\} &\longleftarrow& \left\{ c^\alpha_{,\mu_1 \cdots \mu_k} \right\} \\ \left\{ \frac{d^k}{ d x^{\mu_1} \cdots d x^{\mu_k}} \left( \frac{\delta_{EL} L}{\delta \phi^a} \right) \right\} &\longleftarrow& \left\{ \overline{\phi}_{a,\mu_1 \cdots \mu_k} \right\} \\ \left\{ 0 \right\} &\longleftarrow& \left\{ \overline{c}_{\alpha,\mu_1 \cdots \mu_k} \right\} } $$
Proof

The proof of this proposition is a special case of the observation that the differentials involved are part of the local BV-BRST differential; this will be a direct consequence of the proof of prop. 82 below.

The local analog of def. 118 is now the following definition 120 of the “derived prolonged shell” of the theory (recall the ordinary prolonged shell $\mathcal{E}^\infty \hookrightarrow J^\infty_\Sigma(E)$ from (49)):

Definition

(derived reduced prolonged shell)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) over some spacetime $\Sigma$, and let $\mathcal{G} \overset{gb}{\to} \Sigma$ be a bundle of closed irreducible gauge parameters (def. 23), inducing via prop. 81 a section $\delta_{EL} L$ of the local cotangent Lie algebroid of the jet bundle gauge-action Lie algebroid.

Then the derived prolonged shell $\mathcal{E}^\infty_{BV}$ is the derived critical locus of $\delta_{EL} L$, hence the homotopy pullback of $\delta_{EL} L$ along the zero section of the local cotangent Lie ∞-algebroid:

$$ \array{ \mathcal{E}^\infty_{BV} &\longrightarrow& E/( \mathcal{G} \times_\Sigma T \Sigma ) \\ \downarrow && \downarrow^{\mathrlap{0}} \\ E/(\mathcal{G} \times_\Sigma T \Sigma) &\underset{\delta_{EL} L}{\longrightarrow}& T^\ast_\Sigma\left( E/( \mathcal{G} \times_\Sigma T \Sigma ) \right) } $$

The local refinement of example 100 is now the following:

Proposition

(local BV-BRST complex)

Let $(E,\mathbf{L})$ be a Lagrangian field theory with bundle of closed irreducible gauge parameters $\mathcal{G}$

…. the Chevalley-Eilenberg algebra of the derived prolonged shell $\mathcal{E}^\infty_{BV}$ (def. 120) is the local BV-BRST complex

Proof

By unwinding the definitions analogous to the proof of example 99, the CE-differential is given by the modified bracket of derivations (147) with the sum of the BRST-differential and the Lagrangian density:

$$ d_{CE( T^\ast_\Sigma(E/(\mathcal{G} \times_\Sigma T\Sigma)) )} \;=\; \left\{ \left(L + L_{BRST}\right) dvol_\Sigma \;,\; - \right\} $$

only that in the homotopy fiber $\mathcal{E}^\infty_{BV}$ the derivations receive a degree-shift by -1 compared to their degree in $T^\ast_\Sigma(E /( \mathcal{G} \times_\Sigma T \Sigma ))$.

This operation is the local BV-BRST differential by (Barnich-Henneaux 96 (2.12)-(2.13)).

Example

(derived prolonged shell in the absence of explicit gauge symmetry – the local BV-complex)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) with no non-trivial infinitesimal gauge symmetries made explicit (possibly because there are none, as for the scalar field), hence with no ghost fields introduced. Then the local derived critical locus of its Lagrangian density (def. 120) is the local BV-complex of def. 85.

Example

(local BV-complex of vacuum electromagnetism on Minkowski spacetime)

Consider the Lagrangian field theory of free electromagnetism on Minkowski spacetime (example 40) with gauge parameter as in example 92. With the field and gauge parameter coordinates as chosen in these examples

$$ \left( (a_\mu), c \right) $$

then the local BV-BRST complex (prop. 82) has generators

$$ \array{ & \overline{c} & \overline{a}^\mu & a_\mu & c \\ deg = & -2 & -1 & 0 & 1 } $$

together with their total spacetime derivatives, and the local BV-BRST differential $s$ acts on these generators as follows:

$$ s \;\colon\; \left\{ \array{ \overline{a}^\mu &\mapsto& f^{\nu \mu}_{,\nu} & \text{(equations of motion -- vacuum Maxwell equations)} \\ \overline{c} &\mapsto& \overline{a}^\mu_{,\mu} & \text{(Noether identity)} \\ a_\mu &\mapsto& c_{,\mu} & \text{(infinitesimal gauge transformation)} } \right. $$
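
As a consistency check on these formulas, one can verify that $s$ squares to zero on the generators. The following sketch assumes, as is standard but not spelled out above, that $s$ commutes with total spacetime derivatives and annihilates the ghost $c$. It also assumes the normalization $f_{\mu\nu} = \tfrac{1}{2}\left( a_{\nu,\mu} - a_{\mu,\nu} \right)$, matching the Faraday tensor (30) as it is used later in this section.

```latex
\begin{aligned}
  s(f_{\mu\nu})
    & = \tfrac{1}{2}\left( c_{,\nu\mu} - c_{,\mu\nu} \right) = 0
    && \text{(partial derivatives commute)}
  \\
  s\left( s(\overline{a}^\mu) \right)
    & = s\left( f^{\nu\mu}{}_{,\nu} \right) = 0
    && \text{(by the first line)}
  \\
  s\left( s(\overline{c}) \right)
    & = s\left( \overline{a}^\mu{}_{,\mu} \right)
      = f^{\nu\mu}{}_{,\nu\mu} = 0
    && \text{(skew tensor contracted with symmetric derivatives)}
  \\
  s\left( s(a_\mu) \right)
    & = s\left( c_{,\mu} \right) = 0
    && \text{(since } s(c) = 0 \text{)}
\end{aligned}
```

The third line is the Noether identity in this example: the divergence of the vacuum Maxwell equations vanishes identically.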

\,

So far the discussion yields just the algebra of functions on the derived reduced prolonged shell. We now discuss the derived analog of the full variational bicomplex (def. 59) to the derived reduced shell.

The analog of the de Rham complex of a derived Lie algebroid is called the Weil algebra:

Definition

(Weil algebra of a Lie algebroid)

Given a derived Lie algebroid $\mathfrak{a}$ over some $X$ (def. 114), its Weil algebra is

$$ W(\mathfrak{a}) \;\coloneqq\; \left( Sym_{C^\infty(X)}( \Gamma(T^\ast X) \oplus \mathfrak{a}_\bullet \oplus \mathfrak{a}[1]_\bullet ) \;,\; \mathbf{d}_W \coloneqq \mathbf{d} + d_{CE} \right) \,, $$

where $\mathbf{d}$ acts as the de Rham differential $\mathbf{d} \colon C^\infty(X) \to \Gamma(T^\ast X)$ on functions, and as the degree shift operator $\mathbf{d} \colon \mathfrak{a}_\bullet \to \mathfrak{a}[1]_\bullet$ on the graded elements.

| smooth manifolds | derived Lie algebroids |
|---|---|
| algebra of functions | Chevalley-Eilenberg algebra |
| algebra of differential forms | Weil algebra |

Example

(classical Weil algebra)

Let $\mathfrak{g}$ be a Lie algebra with corresponding Lie algebroid $B \mathfrak{g}$ (example 95). Then the Weil algebra (def. 121) of $B \mathfrak{g}$ is the traditional Weil algebra of $\mathfrak{g}$ from classical Lie theory.
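
For orientation, the traditional Weil algebra can be written out in coordinates. The sketch below assumes a basis of $\mathfrak{g}$ with structure constants $\gamma^\alpha{}_{\beta\gamma}$, degree-1 generators $c^\alpha$ with the CE-differential in the convention $c^\alpha \mapsto \tfrac{1}{2}\gamma^\alpha{}_{\beta\gamma} c^\beta c^\gamma$ used above, and the hypothetical name $r^\alpha \coloneqq \mathbf{d} c^\alpha$ for the shifted generators in degree 2. Signs vary between references with the choice of conventions.

```latex
\begin{aligned}
  d_W \, c^\alpha
    & = \tfrac{1}{2} \gamma^\alpha{}_{\beta \gamma} \, c^\beta c^\gamma + r^\alpha
  \\
  d_W \, r^\alpha
    & = - \gamma^\alpha{}_{\beta \gamma} \, r^\beta c^\gamma
\end{aligned}
% The second line is forced by (d_W)^2 c^\alpha = 0:
% the terms cubic in c cancel by the Jacobi identity,
% and the terms linear in r fix d_W r^\alpha as shown.
```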

Definition

(variational BV-bicomplex)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) equipped with a closed irreducible gauge parameter bundle $\mathcal{G}$ (def. 23). Consider the Lie algebroid $E/(\mathcal{G} \times_\Sigma T \Sigma)$ from example 97, whose Chevalley-Eilenberg algebra is the local BRST complex of the theory.

Then its Weil algebra $W(E/(\mathcal{G} \times_\Sigma T \Sigma))$ (def. 121) has as differential the variational derivative (def. 59) plus the BRST differential

$$ \begin{aligned} d_{W} & = \mathbf{d} - (d - s_{BRST}) \\ & = \delta + s_{BRST} \end{aligned} \,. $$

Therefore we speak of the variational BRST-bicomplex and write

$$ \Omega^\bullet_\Sigma( E/(\mathcal{G} \times_\Sigma T \Sigma) ) \,. $$

Similarly, the Weil algebra of the derived prolonged shell $\mathcal{E}^\infty_{BV}$ (def. 120) has differential

$$ \begin{aligned} d_W & = \mathbf{d} - (d - s) \\ & = \delta + s \end{aligned} \,. $$

Since $s$ is the BV-BRST differential (prop. 82) this defines the “variational BV-BRST-bicomplex”.
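
The second equality in both displays uses the splitting of the de Rham differential on the jet bundle into its vertical and horizontal parts (def. 59). Spelled out as a short sketch, with $\mathbf{d}$ the full de Rham differential, $\delta$ the variational (vertical) derivative and $d$ the total spacetime (horizontal) derivative:

```latex
\begin{aligned}
  \mathbf{d} & = \delta + d
  \\
  d_W & = \mathbf{d} - (d - s)
        = (\delta + d) - d + s
        = \delta + s \,.
\end{aligned}
```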

(…)

\,

It turns out that the local BV-BRST cohomology (prop. 82) of the “derived reduced prolonged shell” very neatly captures all the aspects of Lagrangian field theory that we have been discussing so far:

Example

(Noether theorem I in terms of local BRST cohomology)

The $(d-s)$-closed elements in degree $(p,0)$ are precisely pairs $(v,J_v)$ consisting of an implicit infinitesimal local gauge symmetry $v$ and a conserved current $J_v$ for it.

The $d_W$-exact elements in this degree are sums of

  1. dd-exact currents;

  2. on-shell vanishing implicit gauge transformations;

  3. on-shell vanishing currents with their horizontally exact gauge transformations

(…)

Proof

The $d_W$-closed elements are the implicit infinitesimal gauge symmetries $v$ regarded as an antifield $v^a \overline{\phi}_a$ multiplied with the volume form $dvol_\Sigma$ together with their Noether current $J_v \in \Omega^{p,0}_\Sigma(E)$ (prop. 30)

$$ \array{ \{J_v\} &\overset{d}{\longrightarrow}& \{ \overset{= 0}{\overbrace{ d J_v - \iota_v \delta_{EL}\mathbf{L} }} \} \\ && \uparrow\mathrlap{s_{BV}} \\ && \{ v^a \overline{\phi}_a dvol_\Sigma\} } $$

Such a pair is exact if

$$ \array{ \{K\} &\overset{d}{\longrightarrow}& \{ d K + v^{a \mu} \frac{\delta_{EL}L}{\delta \phi^a} \iota_{\partial_\mu} dvol_\Sigma \} &\overset{d}{\longrightarrow}& \{ \overset{= 0}{\overbrace{ d J_v - \iota_v \delta_{EL}\mathbf{L} }} \} \\ && \mathllap{s_{BV}}\uparrow && \uparrow\mathrlap{s_{BV}} \\ && v^{a \mu} \overline{\phi}_a \iota_{\partial_\mu} dvol_\Sigma &\underset{-d}{\longrightarrow}& \{ v^a \overline{\phi}_a dvol_\Sigma\} \\ && && \uparrow\mathrlap{s_{BV}} \\ && && \kappa^{a b } \overline{\phi}_a \overline{\phi}_b dvol_\Sigma } $$

(…)

Example

(infinitesimal gauge symmetry via local BRST cohomology)

An infinitesimal gauge symmetry $v_\epsilon$ of gauge parameter $(\epsilon^\alpha)$ is a vector field on the jet bundle with components of the form

$$ \mathcal{L}_{v_\epsilon} \phi^a \;\coloneqq\; R^a_\alpha \epsilon^\alpha + R^{a \mu}_\alpha \frac{d \epsilon^\alpha}{d x^\mu} $$

such that this is an infinitesimal symmetry of the Lagrangian in that

$$ \begin{aligned} \iota_{v_\epsilon} \delta_{EL} \mathbf{L} & = v^a \frac{\delta_{EL} L}{\delta \phi^a} dvol_\Sigma \\ & = \epsilon^\alpha \left( R^a_\alpha \frac{\delta_{EL} L}{ \delta \phi^a} - \frac{d}{d x^\mu} \left( R^{a \mu}_\alpha \frac{\delta_{EL} L}{\delta \phi^a} \right) \right) dvol_\Sigma + d\left( \epsilon^\alpha R^{a \mu}_\alpha \frac{\delta_{EL} L}{\delta \phi^a} \right) \iota_{\partial_\mu} dvol_\Sigma \\ & = 0 + d(...) \end{aligned} $$

for all $(\epsilon^\alpha)$.
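
The step from the first to the second line of this computation is the product rule for the total spacetime derivative. As a sketch, abbreviating the Euler-Lagrange derivative by the hypothetical shorthand $E_a \coloneqq \frac{\delta_{EL} L}{\delta \phi^a}$:

```latex
\begin{aligned}
  v^a E_a
  & =
  \left(
    R^a_\alpha \epsilon^\alpha
    + R^{a \mu}_\alpha \frac{d \epsilon^\alpha}{d x^\mu}
  \right) E_a
  \\
  & =
  \epsilon^\alpha
  \left(
    R^a_\alpha E_a
    - \frac{d}{d x^\mu}\left( R^{a \mu}_\alpha E_a \right)
  \right)
  +
  \frac{d}{d x^\mu}\left( \epsilon^\alpha R^{a \mu}_\alpha E_a \right) \,.
\end{aligned}
```

Since the result is required to be horizontally exact for all $\epsilon^\alpha$, the coefficient of $\epsilon^\alpha$ in the first summand has to vanish. This is the Noether identity associated with the gauge symmetry.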

The corresponding antifields $\overline{c}_\alpha$ of the ghost fields are taken by the BV-BRST differential to the antifield-preimage of the term on the left:

$$ s\left(\overline{c}_\alpha\right) \;=\; R^a_\alpha \overline{\phi}_a - \frac{d}{d x^\mu} \left( R^{a \mu}_\alpha \overline{\phi}_a \right) \,. $$

Moreover, an on-shell vanishing infinitesimal symmetry of the Lagrangian is a vector field with components of the form

$$ \kappa^{a b} \frac{\delta_{EL} L}{\delta \phi^a} $$

for $\kappa^{a b} = - \kappa^{b a}$ a skew-symmetric system of smooth functions on the jet bundle.

The linear combination of such an infinitesimal gauge symmetry and an on-shell vanishing infinitesimal symmetry is $(s+d)$-exact:

$$ \begin{aligned} v^a dvol_\Sigma & = \left( R^a_\alpha \epsilon^\alpha + R^{a \mu}_\alpha \frac{d \epsilon^\alpha}{d x^\mu} + \kappa^{a b} \frac{\delta_{EL} L }{ \delta \phi^a } \right) dvol_\Sigma \\ & = s \left( \epsilon^\alpha \overline{c}_\alpha - \tfrac{1}{2}\kappa^{a b} \overline{\phi}_a \overline{\phi}_b \right) dvol_\Sigma + d\left( \epsilon^\alpha R^{a \mu}_\alpha \right) \iota_{\partial_\mu} dvol_\Sigma \end{aligned} $$

(Barnich-Brandt-Henneaux 94, p. 20)

It may be useful to organize this expression into the $(s+d)$-bicomplex like so:

$$ \array{ \{K\} &\overset{d}{\longrightarrow}& \{ d K + \epsilon^\alpha R^{a \mu}_\alpha \frac{\delta_{EL}\mathbf{L}}{ \delta \phi^a} \} &\overset{d}{\longrightarrow}& \{ \overset{= 0}{\overbrace{ d J_v - \iota_v \delta_{EL}\mathbf{L} }} \} \\ && \mathllap{s}\uparrow && \uparrow\mathrlap{-s} \\ && \epsilon^\alpha R^{a \mu}_\alpha \overline{\phi}_a \iota_{\partial_\mu} dvol_\Sigma &\underset{d}{\longrightarrow}& \left\{ d\left( \epsilon^\alpha R^{a \mu}_\alpha \overline{\phi}_a \right) \iota_{\partial_\mu} dvol_\Sigma + \left( R^a_\alpha \epsilon^\alpha + R^{a \mu}_\alpha \frac{d \epsilon^\alpha}{d x^\mu} + \kappa^{a b} \frac{\delta_{EL} L }{ \delta \phi^a } \right) \overline{\phi}_a \, dvol_\Sigma \right\} \\ && && \uparrow\mathrlap{-s} \\ && && \left( - \epsilon^\alpha \overline{c}_\alpha + \tfrac{1}{2}\kappa^{a b } \overline{\phi}_a \overline{\phi}_b \right) dvol_\Sigma } $$

(…)

Gauge fixing

We have seen above that the key intermediate construction for obtaining the quantum field theory induced from a Lagrangian density is its covariant phase space (prop. 46). But we have also seen that there are generically obstructions to the existence of the covariant phase space, embodied by infinitesimal gauge transformations that have been “left implicit” (prop. 77). We have then discussed the reduced phase space above, which makes the infinitesimal gauge symmetries “explicit” by forming their homotopy quotient, whose algebra of functions, on the derived shell, is the BV-BRST complex of the theory. It remains to show that this construction of the reduced phase space indeed serves to lift the obstruction to the existence of the covariant phase space. This is the topic of gauge fixing (def. … below).

The point is that while the reduced phase space reflected by the BV-BRST complex may still not be manifestly covariant, its existence as an object in homotopy theory, here specifically in homological algebra, means that it comes with a more flexible concept of “equality”, namely homotopy equivalence, which here specifically means quasi-isomorphism.

Broadly speaking, the gauge principle in physics says that no two things (field histories, etc.) are ever really equal; instead they may be connected by gauge transformations. The mathematical reflection of this is the principle of homotopy theory, where no two homotopy types are ever equal; instead they may be connected by (weak) homotopy equivalence. Hence picking a specific representative of a homotopy type means to fix a gauge.

Concretely, let $(E,\mathbf{L})$ be a Lagrangian field theory with closed irreducible gauge parameter bundle $\mathcal{G}$ and let $BV(s)$ be the corresponding BV-BRST complex (prop. 82).

We then ask for another field bundle $E^g$, possibly itself already a graded manifold, hence an object in higher differential geometry, and then we ask for a Lagrangian density $\mathbf{L}^g$ that may also genuinely live in higher prequantum geometry, hence which is defined right away on the action Lie algebroid $E^g/(\mathcal{G} \times_\Sigma T \Sigma)$ (example 97), not necessarily descending to there from $E^g$ itself. We may still form the local derived critical locus of $\mathbf{L}^g$ in $E^g/(\mathcal{G} \times_\Sigma T \Sigma)$ and obtain a corresponding BV-BRST-like complex $BV(s^g)$.

We ask now that $BV(s^g)$ has particularly good properties:

  1. We ask that the Koszul-Tate component $s^g_{BV}$ of $s^g$ has vanishing cochain cohomology in negative degree, which means by prop. 83 that the Lagrangian $\mathbf{L}^g$ on the graded field bundle $E^g$ is degreewise free of the obstruction to the existence of a covariant phase space.

  2. We ask moreover that the remaining Chevalley-Eilenberg component $s^g_{BRST}$ of $s^g$ is compatible with the graded Poisson bracket of this graded covariant phase space.

This means that the “dg-Lagrangian field theory” $(E^g, \mathbf{L}^g)$ induces a covariant reduced phase space internal to dg-manifolds; hence a “dg-covariant reduced phase space”: a graded covariant reduced phase space equipped with compatible differentials.

Such derived phase spaces $BV(s^g)$ are amenable to degreewise quantization (discussed below) if only one can keep the degreewise quantization compatible with the differential. This may be shown (below…) to be the case, and hence performing the quantization degreewise and passing in the end to the cochain cohomology of the resulting BV-BRST complex of quantum observables yields the gauge invariant local observables of the quantum field theory. This is called the “BV-BRST quantization of gauge theories”.

In order to apply this to the Lagrangian field theory $(E,\mathbf{L})$ that we actually started out to consider, we now only need to ensure that the “manifestly covariant” dg-Lagrangian field theory $(E^g, \mathbf{L}^g)$ is, while not necessarily equal to $(E,\mathbf{L})$, homotopy equivalent to it as an object in higher prequantum geometry, hence that there is a quasi-isomorphism between the corresponding BV-BRST complexes

$$ BV(s^g) \overset{\simeq}{\longrightarrow} BV(s) \,. $$

The choice of this quasi-isomorphism hence means a choice of particularly good (namely manifestly covariant) representative of the homotopy type of $BV(s)$, and hence this is called a gauge fixing of $BV(s)$.

\,

$$ \array{ \underline{\mathbf{\text{pre-quantum geometry}}} && \underline{\mathbf{\text{higher pre-quantum geometry}}} \\ \, \\ \left\{ \array{ \text{Lagrangian field theory with} \\ \text{implicit infinitesimal gauge transformations} } \right\} &\overset{ \text{explicate} \atop \text{gauge transformations} }{\longrightarrow}& \left\{ \array{ \text{dg-Lagrangian field theory with} \\ \text{explicit infinitesimal gauge transformations} \\ \text{ embodied by BRST complex } } \right\} \\ && \Big\downarrow{}^{\mathrlap{ \text{pass to} \atop \text{derived critical locus} }} \\ \Big\downarrow && \left\{ \array{ \text{dg-reduced phase space} \\ \text{ embodied by BV-BRST complex } } \right\} \\ && {}^{\mathllap{\simeq}}\Big\downarrow{}^{\mathrlap{\text{fix gauge} }} \\ \left\{ \array{ \text{ decategorified } \\ \text{ covariant } \\ \text{ reduced phase space } } \right\} &\underset{\text{pass to cohomology}}{\longleftarrow}& \left\{ \array{ \text{ dg-covariant} \\ \text{reduced phase space } } \right\} \\ && \Big\downarrow{}^{\mathrlap{ \array{ \text{ quantize } \\ \text{degreewise} } }} \\ \left\{ \array{ \text{gauge invariant} \\ \text{quantum observables} } \right\} &\underset{\text{pass to cohomology}}{\longleftarrow}& \left\{ \array{ \text{quantum} \\ \text{BV-BRST complex} } \right\} } $$

Here:

| term | meaning |
|---|---|
| “phase space” | derived critical locus of Lagrangian equipped with Poisson bracket |
| “reduced” | gauge transformations have been homotopy-quotiented out |
| “covariant” | Cauchy surfaces exist degreewise |

\,

In practice this choice of gauge fixing by choice of quasi-isomorphism to a “manifestly covariant” BV-BRST complex is realized as the composite of two separate quasi-isomorphisms:

  1. an “anti-canonical transformation”

    $$ e^{-\{\psi,-\}} \;\colon\; BV(s^g) \longrightarrow BV( e^{-\{\psi,-\}} s^g ) $$

    (induced by a degree -1 element $\psi$ called, for better or worse, the “gauge fixing fermion”) which is actually a genuine isomorphism, not just a quasi-isomorphism;

  2. a genuine quasi-isomorphism which contracts away a contractible direct summand of auxiliary fields

    $$ BV( e^{-\{\psi,-\}} s^g ) \overset{ \text{forget contractible complex} \atop \text{of auxiliary fields} }{\longrightarrow} BV(s) . $$

(…)

Example

(Nakanishi-Lautrup gauge fixing of vacuum electromagnetism)

Consider the local BV-BRST complex

$$ \left( E/(\mathcal{G} \times_\Sigma T \Sigma) \right)_{d L \simeq 0} $$

for vacuum electromagnetism on Minkowski spacetime from example 102.

The field bundle is $E \coloneqq T^\ast \Sigma$ and the gauge parameter bundle is $\mathcal{G} \coloneqq \Sigma \times \mathbb{R}$. The 0-jet generators are

$$ \array{ & \overline{c} & \overline{a}^\mu & a_\mu & c \\ deg = & -2 & -1 & 0 & 1 } $$

and the differential acts as

$$ s \;\colon\; \left\{ \array{ \overline{a}^\mu &\mapsto& f^{\nu \mu}_{,\nu} & \text{(equations of motion -- vacuum Maxwell equations)} \\ \overline{c} &\mapsto& \overline{a}^\mu_{,\mu} & \text{(Noether identity)} \\ a_\mu &\mapsto& c_{,\mu} & \text{(infinitesimal gauge transformation)} } \right. $$

The Lagrangian density for vacuum electromagnetism is (42)

(148) $$ \mathbf{L}_{em} \coloneqq \tfrac{1}{2} f_{\mu \nu} f^{\mu \nu} \,. $$

Consider the contractible chain complex of vector bundles over $\Sigma$

$$ NL \;\coloneqq\; \left( \array{ & \Sigma \times \langle \tilde c\rangle &\overset{\tilde c \mapsto b}{\longrightarrow}& \Sigma \times\langle b\rangle \\ deg = & -1 && 0 } \right) \,. $$

In this context $\Sigma \times \langle b\rangle$ is called the field bundle for the Nakanishi-Lautrup field and $\Sigma \times \langle \tilde c\rangle$ that for the antighost field.
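
That this complex is contractible can be seen on the level of its generators. The following sketch uses a hypothetical contracting homotopy $h$, not named in the text, and writes $d_{NL}$ (also a name chosen here) for the differential $\tilde c \mapsto b$:

```latex
\begin{aligned}
  d_{NL}(\tilde c) & = b \,, & d_{NL}(b) & = 0 \,,
  \\
  h(b) & = \tilde c \,, & h(\tilde c) & = 0 \,,
  \\
  (d_{NL} \circ h + h \circ d_{NL})(b) & = b \,,
  &
  (d_{NL} \circ h + h \circ d_{NL})(\tilde c) & = \tilde c \,.
\end{aligned}
```

So the identity map on the span of $\tilde c$ and $b$ is chain homotopic to zero, and the cohomology of this two-term complex vanishes. This is why adjoining these fields does not change the cohomology of the BV-BRST complex.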

The corresponding product BV-BRST complex, which is quasi-isomorphic to the original one,

$$ \left( E/(\mathcal{G} \times_\Sigma T \Sigma) \right)_{d L \simeq 0} \times_\Sigma NL $$

has coordinate generators

$$ \array{ & \overline{c} & \overline{a}^\mu & a_\mu & c \\ & & \tilde c & b \\ deg = & -2 & -1 & 0 & 1 } \,. $$

We say that the Nakanishi-Lautrup gauge fixing fermion for Gaussian averaged Lorenz gauge is

(149) $$ \psi \coloneqq \tilde c( b + a^{\mu}_{,\mu} ) \;\in\; CE\left( \left( E/(\mathcal{G} \times_\Sigma T \Sigma) \right)_{d L \simeq 0} \times_\Sigma NL \right) \,. $$

With $L_{NL}$ denoting the anti-Hamiltonian for the differential $s_{NL}$ of the resolved local BV-BRST complex $\left(E/(\mathcal{G} \times_\Sigma T \Sigma) \right)_{d L \simeq 0} \times_\Sigma NL$ we find from (148) and (149) the antibracket

$$ \{\psi,L_{NL}\} = b ( b + a^{\mu}_{,\mu} ) + \tilde c_{,\mu} c^{,\mu} $$

and then

$$ \{\psi, \{\psi, L_{NL}\}\} = 0 \,. $$

Therefore the corresponding gauge fixed Lagrangian density is

$$ \begin{aligned} \mathbf{L}^g & \coloneqq e^{\{\psi,-\}}\mathbf{L}_{NL} \\ & = \left( \tfrac{1}{2} f_{\mu \nu} f^{\mu \nu} + b ( b + a^{\mu}_{,\mu} ) + \tilde c_{,\mu} c^{,\mu} \right) dvol_\Sigma \end{aligned} \,. $$

(see also Henneaux 90, section 9.1)
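
The exponential here reduces to finitely many terms because its power series terminates. As a sketch, reading $e^{\{\psi,-\}}$ as the usual exponential series of the operator $\{\psi,-\}$ (an assumption about the notation), the vanishing of the double antibracket gives:

```latex
\begin{aligned}
  e^{\{\psi,-\}} \mathbf{L}_{NL}
  & =
  \mathbf{L}_{NL}
  + \{\psi, \mathbf{L}_{NL}\}
  + \tfrac{1}{2} \{\psi, \{\psi, \mathbf{L}_{NL}\}\}
  + \cdots
  \\
  & =
  \mathbf{L}_{NL} + \{\psi, \mathbf{L}_{NL}\} \,,
\end{aligned}
% since \{\psi, \{\psi, L_{NL}\}\} = 0 forces all higher nested brackets to vanish as well.
```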

The Euler-Lagrange equations of motion induced by this Lagrangian density $\mathbf{L}^g$ (def. 61) are

(150) $$ \left\{ \begin{aligned} f^{\mu \nu}{}_{,\mu} & = b^{,\nu} \\ b & = -\tfrac{1}{2} a^\mu_{,\mu} \\ c_{,\mu}{}^{,\mu} & = 0 \\ \tilde c_{,\mu}{}^{,\mu} & = 0 \end{aligned} \right. \phantom{AAA} \Leftrightarrow \phantom{AAA} \left\{ \begin{aligned} \Box a_\nu & = 0 \\ b & = - \tfrac{1}{2} div a \\ \Box c & = 0 \\ \Box \tilde c & = 0 \end{aligned} \right. $$

Here on the left we show the equations as they appear directly from the Euler-Lagrange variational derivative (prop. 22). The operator $\Box$ on the right is the wave operator (example 25) and $div$ denotes the divergence. The equivalence to the equations on the right follows from using in the first equation the derivative of the second equation on the left, which is

$$ b^{,\nu} = -\tfrac{1}{2} a^{\mu,\nu}{}_{,\mu} \,, $$

and recalling the definition of the universal Faraday tensor (30):

$$ f^{\mu \nu}{}_{,\mu} = \tfrac{1}{2} \left( a^{\nu,\mu}{}_{,\mu} - a^{\mu,\nu}{}_{,\mu} \right) \,. $$

The differential equations on the right are manifestly a system of normally hyperbolic differential equations, as opposed to the plain vacuum Maxwell equations on Minkowski spacetime $\frac{d}{d x^\mu} f^{\mu \nu} = 0$ (see also Rejzner 16, section 7.2).
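Spelled out, the step from the first equation on the left of (150) to the wave equation for the gauge field is the following short computation. It is only a sketch that combines the two identities displayed above, and it assumes that the wave operator of example 25 acts as $\Box a^\nu = a^{\nu,\mu}{}_{,\mu}$ (up to the sign convention used there):

```latex
\begin{aligned}
  0
  & = f^{\mu \nu}{}_{,\mu} - b^{,\nu}
  \\
  & = \tfrac{1}{2} \left( a^{\nu,\mu}{}_{,\mu} - a^{\mu,\nu}{}_{,\mu} \right)
      + \tfrac{1}{2} a^{\mu,\nu}{}_{,\mu}
  \\
  & = \tfrac{1}{2} a^{\nu,\mu}{}_{,\mu}
  \\
  & = \tfrac{1}{2} \Box a^\nu
\end{aligned}
```

The two divergence terms cancel, so that only the wave operator acting on $a^\nu$ remains. This cancellation is what the Nakanishi-Lautrup field $b$ contributes: without it the term $a^{\mu,\nu}{}_{,\mu}$ would survive and the equation would not be normally hyperbolic.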

(…)

Proposition 77 implies that we need a good handle on determining whether the space of implicit infinitesimal gauge symmetries modulo trivial ones is non-zero. This obstruction turns out to be neatly captured by methods of homological algebra applied to the local BV-complex (def. 85):

Example

(cochain cohomology of local BV-complex)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) whose field bundle $E$ is a trivial vector bundle (example 9) and whose Lagrangian density $\mathbf{L}$ is spacetime-independent (example 24), and let $\Sigma \times \{\varphi\} \hookrightarrow \mathcal{E}$ be a constant section of the shell (56).

By inspection we find that the cochain cohomology of the local BV-complex $\Omega^{0,0}_{\Sigma,cp}(E,\varphi)\vert_{\mathcal{E}_{BV}}$ (def. 85) has the following interpretation:

In degree 0 the image of the BV-differential coming from degree -1 and modulo $d$-exact terms

$$ im\left( \Gamma_{\Sigma,cp}(J^\infty_\Sigma T_\Sigma(E,\varphi)) \overset{s_{BV}}{\to} \Omega^{0,0}_\Sigma(E,\varphi)/im(d) \right) $$

is the ideal of functions modulo $im(d)$ that vanish on-shell. Since the differential going from degree 0 to degree 1 vanishes, the cochain cohomology in this degree is the quotient ring

$$ H^0\left(\Omega^{0,0}_{\Sigma,cp}(E,\varphi)\vert_{\mathcal{E}_{BV}}\vert d\right) \;\simeq\; \Omega^{0,0}_{\Sigma,cp}(E,\varphi)\vert_{\mathcal{E}}/im(d) $$

of functions on the shell $\mathcal{E}$ (94).

In degree -1 the kernel of the BV-differential going to degree 0

$$ ker\left( \Gamma_{\Sigma,cp}(J^\infty_\Sigma T_\Sigma(E,\varphi)) \overset{s_{BV}}{\to} \Omega^{0,0}_\Sigma(E,\varphi)\right) $$

is the space of implicit infinitesimal gauge symmetries (def. \ref{ImplicitInfinitesimalGaugeSymmetry}) and the image of the differential coming from degree -2

$$ im\left( \Gamma_{\Sigma,cp}(J^\infty_\Sigma T_\Sigma E,\varphi) \wedge_{\Omega^{0,0}_{\Sigma,cp}(E,\varphi)} \Gamma_{\Sigma,cp}(J^\infty_\Sigma T_\Sigma E,\varphi) \overset{s_{BV}}{\longrightarrow} \Gamma_{\Sigma,cp}(J^\infty_\Sigma T_\Sigma E,\varphi) \right) $$

is the space of trivial implicit infinitesimal gauge transformations (example 91).

Therefore the cochain cohomology in degree -1 is the quotient space of implicit infinitesimal gauge transformations modulo the trivial ones:

$$ H^{-1}\left( \Omega^{0,0}_\Sigma(E,\varphi)\vert_{\mathcal{E}_{BV}} \right) \simeq \frac{ \left\{ \text{implicit infinitesimal gauge transformations} \right\} } { \left\{ \text{trivial implicit infinitesimal gauge transformations} \right\} } \tag{151} $$
Proposition

(local BV-complex is homological resolution of the shell precisely if there are no non-trivial implicit infinitesimal gauge symmetries)

Let $(E,\mathbf{L})$ be a Lagrangian field theory (def. 60) whose field bundle $E$ is a trivial vector bundle (example 9) and whose Lagrangian density $\mathbf{L}$ is spacetime-independent (example 24), and let $\Sigma \times \{\varphi\} \hookrightarrow \mathcal{E}$ be a constant section of the shell (56). Furthermore assume that $\mathbf{L}$ is at least quadratic in the vertical coordinates around $\varphi$.

Then the local BV-complex $\Omega^{0,0}_\Sigma(E,\varphi)\vert_{\mathcal{E}_{BV}}$ of local observables (def. 85) is a homological resolution of the algebra of functions on the infinitesimal neighbourhood of $\varphi$ in the shell (example 24), hence the canonical comparison morphism (98) is a quasi-isomorphism, precisely if there is no non-trivial (example 91) implicit infinitesimal gauge symmetry (def. \ref{ImplicitInfinitesimalGaugeSymmetry}):

$$ \left( \Omega^{0,0}_{\Sigma}(E,\varphi)\vert_{\mathcal{E}_{BV}} \overset{\simeq}{\longrightarrow} \Omega^{0,0}_{\Sigma}(E,\varphi)\vert_{\mathcal{E}} \right) \;\Leftrightarrow\; \left( \begin{array}{c} \text{there are no non-trivial} \\ \text{implicit infinitesimal gauge transformations} \end{array} \right) \,. $$
Proof

By example 107 the vanishing of non-trivial implicit infinitesimal gauge symmetries is equivalent to the vanishing of the cochain cohomology of the local BV-complex in degree -1 (151).

Therefore the statement to be proven is equivalently that the Koszul complex of the sequence of elements

$$ \left( \frac{\delta_{EL} L}{\delta \phi^a} \in \Omega^{0,0}_{\Sigma,\varphi}(E) \right)_{a = 1}^s $$

is a homological resolution of $\Omega^{0,0}_\Sigma(E,\varphi)\vert_{\mathcal{E}}$, hence has vanishing cohomology in all negative degrees, already if it has vanishing cohomology in degree -1.

By a standard fact about Koszul complexes (this prop.) a sufficient condition for this to be the case is that

  1. the ring $\Omega^{0,0}_{\Sigma}(E,\varphi)$ is the tensor product of $C^\infty(\Sigma)$ with a Noetherian ring;

  2. the elements $\frac{\delta_{EL} L }{\delta \phi^a}$ are contained in its Jacobson radical.

The first condition is the case since $\Omega^{0,0}_{\Sigma}(E,\varphi)$ is by definition a formal power series ring over a field tensored with $C^\infty(\Sigma)$ (by this example). The second condition holds because the Jacobson radical of a power series algebra consists of those elements whose constant term vanishes (see this example): the assumption that $\mathbf{L}$ is at least quadratic, hence that $\delta_{EL}\mathbf{L}$ is at least linear in the fields, guarantees that all $\frac{\delta_{EL}L}{\delta \phi^a}$ are contained in the Jacobson radical.

Prop. 83 says what gauge fixing has to accomplish: given a local BV-BRST complex we need to find a quasi-isomorphism to another complex which is such that it comes from a graded Lagrangian density whose BV-cohomology vanishes in degree -1 and hence induces a graded covariant phase space, and such that the remaining BRST differential respects the Poisson bracket on this graded covariant phase space.

(…)

\,

Quantization

Given any space with infinitesimal symmetries acting on it, there is the corresponding homotopy quotient by these infinitesimal symmetries. For the covariant phase space of a Lagrangian field theory, as above with its Poisson Lie algebra of infinitesimal symmetries (def. 90), this infinitesimal homotopy quotient is known as the Poisson Lie algebroid and the corresponding genuine homotopy quotient is known as the symplectic groupoid. As one passes from phase space to its symplectic groupoid, the algebra of functions on phase space – hence the algebra of observables $Obs$ (def. 83), which is always a commutative algebra – deforms to the corresponding algebra of functions on a Lie groupoid, called the (polarized) convolution algebra of a Lie groupoid. This is now a non-commutative algebra called the algebra of quantum observables $Obs_{\hbar}$; and this passage from phase space to its symplectic groupoid homotopy quotient by Hamiltonian symmetries is called quantization (specifically: “geometric quantization of symplectic groupoids”). Here the strength of the non-commutativity is measured by a deformation parameter called Planck's constant $\hbar$.

Since the product in the algebra of quantum observables $Obs_\hbar$ differs from that in $Obs$, the positivity condition $\langle A^\ast A\rangle \geq 0$ in the definition of states $\langle - \rangle$ of a field theory (def. 86) acquires a different meaning. The states after quantization are called quantum states and their difference witnesses that after quantization the Lagrangian field theory is of a different nature: one says that it is no longer a classical field theory, but a quantum field theory and that the objects whose states are expressed by these new quantum states are quantum fields.

Unfortunately, explicitly constructing the algebra of quantum observables of a Lagrangian field theory and hence “constructing the quantum field theory” turns out to be extremely hard, unless some simplifying assumptions are made.

One kind of simplification occurs when the spacetime dimension is very low. For instance if the spacetime dimension is taken to be $p+1 = 1$ – modelling the approximation where one completely ignores the variation of fields in space and retains just their time evolution – then one speaks of quantum mechanics, which is well understood. Another simplification occurs when the field theory is a free field theory, meaning that its equation of motion is given by a normally hyperbolic linear differential operator. In this case the quantum field theory is fully understood as long as the underlying spacetime is time-orientable and globally hyperbolic. But, as the name indicates, this captures only the case where there is no interaction among the fields.

Since the algebra of quantum observables $Obs_\hbar$ is a deformation with strength $\hbar$ of the commutative algebra of classical observables $Obs$ controlled by the Poisson Lie algebra, another simplification occurs if one gives up on the demand to understand the full deformation at finite value of Planck's constant $\hbar$ and considers just infinitesimal values of $\hbar$. Since this means that the resulting quantum observables are no longer actual smooth functions of $\hbar$, but just formal power series, this is called formal deformation quantization. The resulting “infinitesimally quantized” field theory is called perturbative quantum field theory.

For interacting field theories in spacetime dimension $p+1 \geq 3+1$ the quantization has been constructed to date only in perturbation theory in this way. The construction of full non-perturbative quantum field theory (in dimension $\geq 3+1$ with non-vanishing interaction) is, at the time of this writing, a wide open problem.

But perturbative quantum field theory is well understood. This we turn to next…

(…)

Let $V$ be a vector space of finite dimension and let $\omega \in V \otimes V$ be an element of the tensor product (not necessarily skew-symmetric at the moment).

We may canonically regard $V$ as a smooth manifold, in which case $\omega$ is canonically regarded as a constant rank-2 tensor. As such it has a canonical action by forming derivatives on the tensor product of the space of smooth functions:

$$ \omega \;\colon\; C^\infty(V) \otimes C^\infty(V) \longrightarrow C^\infty(V) \otimes C^\infty(V) \,. $$

If $\{\partial_i\}$ is a linear basis for $V$, identified, as before, with a basis for $\Gamma(T V)$, then in this basis this operation reads

$$ \omega(f \otimes g) \;=\; \omega^{i j} (\partial_i f) \otimes (\partial_j g) \,, $$

where $\partial_i f \coloneqq \frac{\partial f}{\partial x^i}$ denotes the partial derivative of the smooth function $f$ along the $i$th coordinate, and where we use the Einstein summation convention.

For emphasis we write

$$ \begin{array}{ccc} C^\infty(V) \otimes C^\infty(V) &\overset{prod}{\longrightarrow}& C^\infty(V) \\ f \otimes g &\mapsto& f \cdot g \end{array} $$

for the pointwise product of smooth functions.

Definition

(star product induced by constant rank-2 tensor)

Given $(V,\omega)$ as above, then the star product induced by $\omega$ on the formal power series algebra $C^\infty(V) [ [\hbar] ]$ in a formal variable $\hbar$ (“Planck's constant”) with coefficients in the smooth functions on $V$ is the linear map

$$ (-) \star_\omega (-) \;\colon\; C^\infty(V)[ [ \hbar ] ] \otimes C^\infty(V)[ [ \hbar ] ] \longrightarrow C^\infty(V)[ [\hbar] ] $$

given by

$$ (-) \star_\omega (-) \;\coloneqq\; prod \circ \exp\left( \hbar \omega^{i j} \partial_i \otimes \partial_j \right) \,. $$

Hence

$$ f \star_\omega g \;=\; f \cdot g + \hbar \omega^{i j} \frac{\partial f}{\partial x^i} \cdot \frac{\partial g}{\partial x^j} + \hbar^2 \tfrac{ 1 }{2} \omega^{i j} \omega^{k l} \frac{\partial^2 f}{\partial x^{i} \partial x^{k}} \cdot \frac{\partial^2 g}{\partial x^{j} \partial x^{l}} + \cdots \,. $$
Proposition

(star product is associative and unital)

Given $(V,\omega)$ as above, then the star product $(-) \star_\omega (-)$ from def. 123 is associative and unital with unit the constant function $1 \in C^\infty(V) \hookrightarrow C^\infty(V)[ [ \hbar ] ]$.

Hence the vector space $C^\infty(V)[ [ \hbar ] ]$ equipped with the star product $\star_\omega$ is a unital associative algebra.

Proof

Observe that the product rule of differentiation says that

$$ \partial_i \circ prod = prod \circ ( \partial_i \otimes id \;+\; id \otimes \partial_i ) \,. $$

Using this we compute as follows:

$$ \begin{aligned} (f \star_\omega g) \star_\omega h & = prod \circ \exp( \hbar \omega^{i j} \partial_i \otimes \partial_j ) \circ \left( \left( prod \circ \exp( \hbar \omega^{k l} \partial_k \otimes \partial_l ) \right) \otimes id \right) (f \otimes g \otimes h) \\ & = prod \circ \exp( \hbar \omega^{i j} \partial_i \otimes \partial_j ) \circ (prod \otimes id) \circ \left( \exp( \hbar \omega^{k l} \partial_k \otimes \partial_l ) \otimes id \right) (f \otimes g \otimes h) \\ & = prod \circ (prod \otimes id) \circ \exp\left( \hbar \omega^{i j} ( \partial_i \otimes id \otimes \partial_j + id \otimes \partial_i \otimes \partial_j ) \right) \circ \left( \exp( \hbar \omega^{k l} \partial_k \otimes \partial_l ) \otimes id \right) (f \otimes g \otimes h) \\ & = prod \circ (prod \otimes id) \circ \exp( \hbar \omega^{i j} \partial_i \otimes id \otimes \partial_j ) \circ \exp( \hbar \omega^{i j} id \otimes \partial_i \otimes \partial_j ) \circ \exp( \hbar \omega^{k l} \partial_k \otimes \partial_l \otimes id ) (f \otimes g \otimes h) \\ & = prod_3 \circ \exp\left( \hbar \omega^{i j} ( \partial_i \otimes \partial_j \otimes id + \partial_i \otimes id \otimes \partial_j + id \otimes \partial_i \otimes \partial_j ) \right) (f \otimes g \otimes h) \end{aligned} $$

In the last line we used that the ordinary pointwise product of functions is associative, and wrote $prod_3 \colon C^\infty(V) \otimes C^\infty(V) \otimes C^\infty(V) \to C^\infty(V)$ for the unique pointwise product of three functions.

The last expression above is manifestly independent of the choice of bracketing in the triple star product, and hence it is clear that an analogous computation yields

$$ \cdots = f \star_\omega (g \star_\omega h) \,. $$
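On polynomials the exponential series in def. 123 terminates, so the star product can be evaluated exactly. The following is a minimal sketch of this, not part of the original text: the dictionary encoding of polynomials and all function names (`add`, `scale`, `mul`, `diff`, `star`) are assumptions made for illustration. It checks the Moyal commutator $x \star p - p \star x = \hbar$ and associativity for a non-symmetric $\omega$.

```python
from fractions import Fraction
from math import factorial

# A polynomial in x_1..x_n with coefficients in Q[hbar] is stored as a dict
#   {(exponents, k): c}   meaning   c * x^exponents * hbar^k.

def add(p, q):
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def scale(p, c, k=0):
    """Multiply p by the constant c and by hbar^k."""
    return {(e, h + k): c * v for (e, h), v in p.items() if c * v != 0}

def mul(p, q):
    """Ordinary commutative (pointwise) product."""
    out = {}
    for (e1, h1), c1 in p.items():
        for (e2, h2), c2 in q.items():
            key = (tuple(a + b for a, b in zip(e1, e2)), h1 + h2)
            out[key] = out.get(key, 0) + c1 * c2
    return {key: c for key, c in out.items() if c != 0}

def diff(p, i):
    """Partial derivative along the i-th coordinate."""
    out = {}
    for (e, h), c in p.items():
        if e[i] > 0:
            out[(e[:i] + (e[i] - 1,) + e[i + 1:], h)] = c * e[i]
    return out

def star(f, g, omega):
    """prod o exp(hbar * omega^{ij} d_i (x) d_j) applied to f (x) g."""
    n = len(omega)
    result, terms, m = {}, [(f, g)], 0
    while terms:
        # terms holds (omega (d (x) d))^m (f (x) g) as a list of tensor factors
        for F, G in terms:
            result = add(result, scale(mul(F, G), Fraction(1, factorial(m)), m))
        nxt = []
        for F, G in terms:
            for i in range(n):
                for j in range(n):
                    dF, dG = diff(F, i), diff(G, j)
                    if omega[i][j] != 0 and dF and dG:
                        nxt.append((scale(dF, Fraction(omega[i][j])), dG))
        terms, m = nxt, m + 1
    return result

half = Fraction(1, 2)
x = {((1, 0), 0): Fraction(1)}
p = {((0, 1), 0): Fraction(1)}
moyal = [[0, half], [-half, 0]]   # omega = pi/2 for the canonical Poisson tensor
commutator = add(star(x, p, moyal), scale(star(p, x, moyal), -1))
assert commutator == {((0, 0), 1): 1}          # x*p - p*x = hbar

w = [[1, 2], [0, -1]]                          # neither symmetric nor skew
f = {((2, 1), 0): Fraction(1)}                               # x^2 p
g = {((1, 1), 0): Fraction(1), ((0, 2), 0): Fraction(1)}     # x p + p^2
h = {((3, 0), 0): Fraction(1), ((0, 1), 0): Fraction(1)}     # x^3 + p
assert star(star(f, g, w), h, w) == star(f, star(g, h, w), w)
```

Exact rational arithmetic is used so that the associativity check is an equality of polynomials and not a floating-point comparison.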
Proposition

(shift by symmetric contribution is isomorphism of star products)

Let $V$ be a vector space, $\omega \in V \otimes V$ a rank-2 tensor and $\alpha \in Sym(V \otimes V)$ a symmetric rank-2 tensor.

Then the linear map

$$ \begin{array}{ccc} C^\infty(V)[ [\hbar] ] &\overset{\exp\left(\hbar\tfrac{1}{2}\alpha \right)}{\longrightarrow}& C^\infty(V)[ [\hbar] ] \\ f &\mapsto& \exp\left( \hbar\tfrac{1}{2}\alpha^{i j} \partial_i \partial_j \right) f \end{array} $$

constitutes an isomorphism of star product algebras (prop. 84) of the form

$$ \exp\left(\hbar\tfrac{1}{2}\alpha \right) \;\colon\; (C^\infty(V)[ [\hbar] ], \star_\omega) \overset{\simeq}{\longrightarrow} (C^\infty(V)[ [\hbar] ], \star_{\omega + \alpha}) \,, $$

hence identifying the star product induced from $\omega$ with that induced from $\omega + \alpha$.

In particular every star product algebra $(C^\infty(V)[ [\hbar] ],\star_\omega)$ is isomorphic to a Moyal star product algebra $\star_{\tfrac{1}{2}\pi}$ (example 86) with $\tfrac{1}{2}\pi^{i j} \coloneqq \tfrac{1}{2}(\omega^{i j} - \omega^{j i})$ the skew-symmetric part of $\omega$, this isomorphism being exhibited by $\alpha^{i j} = - \tfrac{1}{2}(\omega^{i j} + \omega^{j i})$, which is (minus) the symmetric part.

Proof

We need to show that

$$ prod \circ \exp\left( \hbar(\omega + \alpha) \right) \circ \left( \exp\left( \hbar\tfrac{1}{2}\alpha\right) \otimes \exp\left( \hbar\tfrac{1}{2}\alpha \right) \right) \;=\; \exp\left( \hbar\tfrac{1}{2}\alpha \right) \circ prod \circ \exp( \hbar\omega ) \,. $$

To this end, observe that the product rule of differentiation applied twice in a row implies that

$$ \partial_i \partial_j \circ prod \;=\; prod \circ \left( (\partial_i \partial_j) \otimes id + id \otimes (\partial_i \partial_j) + \partial_i \otimes \partial_j + \partial_j \otimes \partial_i \right) \,. $$

Using this, together with the symmetry of $\alpha$ and the fact that all the constant-coefficient differential operators involved commute with each other, we compute

$$ \begin{aligned} \exp\left( \hbar\tfrac{1}{2}\alpha^{i j} \partial_i \partial_j \right) \circ prod \circ \exp( \hbar \omega^{k l} \partial_{k} \otimes \partial_l ) & = prod \circ \exp\left( \hbar \tfrac{1}{2}\alpha^{i j} \left( (\partial_i \partial_j) \otimes id + id \otimes (\partial_i \partial_j) + \partial_i \otimes \partial_j + \partial_j \otimes \partial_i \right) \right) \circ \exp( \hbar \omega^{k l} \partial_{k} \otimes \partial_l ) \\ & = prod \circ \exp\left( \hbar (\omega^{i j} + \alpha^{i j}) \partial_i \otimes \partial_j \right) \circ \exp\left( \hbar \tfrac{1}{2} \alpha^{i j} (\partial_i \partial_j) \otimes id + \hbar \tfrac{1}{2} \alpha^{i j} id \otimes (\partial_i \partial_j) \right) \\ & = prod \circ \exp\left( \hbar (\omega^{i j} + \alpha^{i j}) \partial_i \otimes \partial_j \right) \circ \left( \exp(\hbar \tfrac{1}{2}\alpha) \otimes \exp(\hbar \tfrac{1}{2}\alpha) \right) \end{aligned} $$
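The sign conventions in this intertwining relation are easy to get wrong, so an explicit check in one variable is useful. The sketch below is an illustration only, with hypothetical helper names (`add`, `times`, `mul`, `d`, `star`, `shift`) and with $V = \mathbb{R}$, so that $\omega$ and $\alpha$ are plain numbers. It verifies on polynomials that $\exp(+\hbar\tfrac{1}{2}\alpha\,\partial^2)$ intertwines $\star_\omega$ with $\star_{\omega + \alpha}$, while the opposite sign in the exponent lands in $\star_{\omega - \alpha}$.

```python
from fractions import Fraction
from math import factorial

# Elements of Q[x][hbar] are stored as dicts {(n, k): c} meaning c * x^n * hbar^k.

def add(p, q):
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def times(p, c, k):
    """Multiply p by the scalar c and by hbar^k."""
    return {(n, h + k): c * v for (n, h), v in p.items()}

def mul(p, q):
    out = {}
    for (n1, h1), c1 in p.items():
        for (n2, h2), c2 in q.items():
            key = (n1 + n2, h1 + h2)
            out[key] = out.get(key, 0) + c1 * c2
    return {key: c for key, c in out.items() if c != 0}

def d(p, m):
    """m-th derivative with respect to x."""
    out = {}
    for (n, h), c in p.items():
        if n >= m:
            for r in range(m):
                c = c * (n - r)
            out[(n - m, h)] = c
    return out

def star(f, g, omega):
    """f *_omega g = sum_m hbar^m omega^m / m! * (d^m f)(d^m g)."""
    out, m = {}, 0
    while d(f, m) and d(g, m):
        coeff = Fraction(omega) ** m / factorial(m)
        out = add(out, times(mul(d(f, m), d(g, m)), coeff, m))
        m += 1
    return out

def shift(f, alpha, sign=1):
    """exp(sign * hbar/2 * alpha * d^2/dx^2) applied to f."""
    out, m = {}, 0
    while d(f, 2 * m):
        coeff = (sign * Fraction(alpha) / 2) ** m / factorial(m)
        out = add(out, times(d(f, 2 * m), coeff, m))
        m += 1
    return out

f = {(3, 0): Fraction(1)}                         # x^3
g = {(2, 0): Fraction(1), (1, 0): Fraction(1)}    # x^2 + x
omega, alpha = Fraction(1, 3), Fraction(2, 5)

# exp(+hbar/2 alpha d^2) is a homomorphism from *_omega to *_(omega + alpha)
assert shift(star(f, g, omega), alpha) == \
    star(shift(f, alpha), shift(g, alpha), omega + alpha)

# with the opposite sign the target is *_(omega - alpha) instead
assert shift(star(f, g, omega), alpha, -1) == \
    star(shift(f, alpha, -1), shift(g, alpha, -1), omega - alpha)
```

Since all sums terminate on polynomials and the arithmetic is exact, both assertions are genuine identities in $\mathbb{Q}[x][\hbar]$ for the chosen test data.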

Some examples of star products as in def. 123:

Example

If $\omega = 0$ in def. 123, then the star product $\star_0 = \cdot$ is the plain pointwise product.

Example

(Moyal star product)

If $\omega = \tfrac{1}{2}\pi$ in def. 123 is skew-symmetric, it may be regarded as a constant Poisson tensor $\pi$ on the smooth manifold $V$. In this case $\star_{\tfrac{1}{2}\pi}$ is called a Moyal star product and the star-product algebra $(C^\infty(V)[ [\hbar] ], \star_{\tfrac{1}{2}\pi})$ is called the Moyal deformation quantization of the Poisson manifold $(V,\pi)$.

(…)

\,

Free quantum fields

We discuss here the quantum observables for the special case of free field theories (def. 62). In perturbative quantum field theory this is the basis of the construction of all interacting theories in the infinitesimal neighbourhood of the free field theories.

\,

\,

Wick algebra and normal ordered products

(…)

To warm up, we first discuss how the star product (def. 123) of a finite dimensional vector space equipped with almost Kähler structure may be interpreted as “normal-ordered product for a single mode”:

Definition

(almost Kähler vector space)

An almost Kähler vector space is a complex vector space $V$ equipped with two bilinear forms $\sigma, h \;\colon\; V \otimes_{\mathbb{R}} V \longrightarrow \mathbb{R}$ such that, with $V$ regarded as a smooth manifold and with $\sigma, h$ regarded as constant tensors, $(V,\sigma,h)$ is an almost Kähler manifold.

Example

(standard almost Kähler vector spaces)

Let $V \coloneqq \mathbb{R}^2$ be equipped with the complex structure given by the canonical identification $\mathbb{R}^2 \simeq \mathbb{C}$, and let $\sigma \coloneqq \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $h \coloneqq \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Then $(V,\sigma,h)$ is an almost Kähler vector space (def. 124).

Definition

(Wick algebra of an almost Kähler vector space)

Let $(\mathbb{R}^{2n},\sigma, h)$ be an almost Kähler vector space (def. 124). Then its Wick algebra is the formal power series vector space $\mathbb{C}[ [ a_1, a^\ast_1, \cdots, a_n, a^\ast_n ] ] [ [ \hbar ] ]$ equipped with the star product

$$ \begin{aligned} P_1 \star_\omega P_2 & \coloneqq prod \circ \exp \left( \hbar \sum_{k_1, k_2 = 1}^{2 n} \omega^{k_1 k_2} \partial_{k_1} \otimes \partial_{k_2} \right) (P_1 \otimes P_2) \\ & = P_1 \cdot P_2 + \hbar \sum_{k_1, k_2 = 1}^{2n} \omega^{k_1 k_2}(\partial_{k_1} P_1) \cdot (\partial_{k_2} P_2) + \cdots \end{aligned} $$

given by the bilinear form

$$ \omega \coloneqq \tfrac{i}{2} \sigma + \tfrac{1}{2} h \,. $$

Here

$$ prod \;\colon\; \mathbb{C}[ [ a_1, a^\ast_1, \cdots, a_n, a^\ast_n ] ] [ [ \hbar ] ] \otimes_{\mathbb{R}} \mathbb{C}[ [ a_1, a^\ast_1, \cdots, a_n, a^\ast_n ] ] [ [ \hbar ] ] \longrightarrow \mathbb{C}[ [ a_1, a^\ast_1, \cdots, a_n, a^\ast_n ] ] [ [ \hbar ] ] $$

is the ordinary (commutative) product in the formal power series algebra.

To make contact with the traditional notation we decorate the elements $P$ in the formal power series algebra with colons and declare the notation

$$ : P_1 : \, :P_2: \;\coloneqq\; : P_1 \star_\omega P_2 : $$
Example

(Wick algebra of a single mode)

Let $V \coloneqq \mathbb{R}^2 \simeq Span(\{x,y\})$ be a standard almost Kähler vector space according to example 109, with canonical coordinates denoted $x$ and $y$. We discuss its Wick algebra according to def. 125 and show that this reproduces the traditional definition of products of “normal ordered” operators.

To that end, consider the complex linear combination of the coordinates to the canonical complex coordinates $z \coloneqq x + i y$ and $\overline{z} \coloneqq x - i y$, which we suggestively write instead as

$$ a \coloneqq \tfrac{1}{\sqrt{2}}(x + i y) \phantom{AAAA} a^\ast \coloneqq \tfrac{1}{\sqrt{2}}(x - i y) $$

(with “$a$” the traditional symbol for the amplitude of a field mode).

We find the value of the almost-Kähler forms on these elements to be

$$ \begin{aligned} \sigma(a,a^\ast) & = \tfrac{1}{2} \sigma( (x + i y), (x - i y) ) \\ & = \tfrac{-i}{2}( \sigma(x,y) - \sigma(y,x) ) \\ & = - i \end{aligned} $$

$$ \begin{aligned} h(a, a^\ast) & = \tfrac{1}{2} h( (x + i y), (x - i y) ) \\ & = \tfrac{1}{2}(h(x,x) + h(y,y)) \\ & = 1 \end{aligned} $$

$$ \begin{aligned} \sigma(a,a) & = \sigma(a^\ast, a^\ast) \\ & = 0 \phantom{AAAAAA} \text{by anti-symmetry} \end{aligned} $$

$$ \begin{aligned} h(a,a) & = \tfrac{1}{2}( h( (x + i y), (x + i y) ) ) \\ & = \tfrac{1}{2}( h(x,x) - h(y,y)) \\ & = 0 \end{aligned} $$

$$ \begin{aligned} h(a^\ast, a^\ast) & = \tfrac{1}{2}( h(x,x) - h(y,y) ) \\ & = 0 \end{aligned} $$

Using this, we find the star product as follows (where we write $(-)\cdot (-)$ for the plain commutative product in the formal power series algebra):

$$ \begin{aligned} a \star_\omega a^\ast & = a \cdot a^\ast + \tfrac{\hbar}{2} \left( \underset{= 1}{\underbrace{i \sigma( a, a^\ast )}} + \underset{= 1}{\underbrace{h(a, a^\ast)}} \right) \\ & = a^\ast \cdot a + \hbar \\ \\ a^\ast \star_\omega a & = a^\ast \cdot a + \tfrac{\hbar}{2} \left( \underset{= -1}{\underbrace{i \sigma( a^\ast , a )}} + \underset{ = 1}{\underbrace{h( a^\ast, a )}} \right) \\ & = a^\ast \cdot a \\ \\ a \star_\omega a & = a \cdot a \\ \\ a^\ast \star_\omega a^\ast & = a^\ast \cdot a^\ast \end{aligned} $$

These four cases are sufficient to see that in the star product $P_1 \star_\omega P_2$ of general elements, we obtain a correction term $\hbar$ to the ordinary commutative product precisely for every pair consisting of a factor of $a$ in $P_1$ and a factor of $a^\ast$ in $P_2$. This is exactly the “normal ordering” prescription.
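The pairing rule just described can be made concrete on polynomials. In the coordinates $(a, a^\ast)$ the four values computed above say that the only non-vanishing component of $\omega$ is $\omega(a, a^\ast) = 1$, so the star product reduces to $P_1 \star_\omega P_2 = \sum_k \tfrac{\hbar^k}{k!} (\partial_a^k P_1) \cdot (\partial_{a^\ast}^k P_2)$. The following sketch implements exactly this reduced formula; the encoding of monomials and the names `add`, `deriv`, `wick_star` are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

# An element is a dict {(m, n, k): c} standing for c * (a*)^m * a^n * hbar^k,
# i.e. every monomial is written in normal order (a* to the left of a).

def add(p, q):
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + c
    return {key: c for key, c in out.items() if c != 0}

def deriv(p, slot):
    """Differentiate with respect to a* (slot 0) or a (slot 1)."""
    out = {}
    for key, c in p.items():
        if key[slot] > 0:
            new = list(key)
            new[slot] -= 1
            out[tuple(new)] = c * key[slot]
    return out

def wick_star(p1, p2):
    """sum_k hbar^k / k! * (d/da)^k p1 * (d/da*)^k p2."""
    out, k = {}, 0
    left, right = p1, p2
    while left and right:
        term = {}
        for (m1, n1, h1), c1 in left.items():
            for (m2, n2, h2), c2 in right.items():
                key = (m1 + m2, n1 + n2, h1 + h2 + k)
                term[key] = term.get(key, 0) + Fraction(c1 * c2) / factorial(k)
        out = add(out, term)
        # each further order contracts one more a from the left factor
        # with one more a* from the right factor
        left, right, k = deriv(left, 1), deriv(right, 0), k + 1
    return out

a = {(0, 1, 0): 1}
a_star = {(1, 0, 0): 1}

assert wick_star(a, a_star) == {(1, 1, 0): 1, (0, 0, 1): 1}   # a*a + hbar
assert wick_star(a_star, a) == {(1, 1, 0): 1}                 # a*a, no correction
assert wick_star(a, a) == {(0, 2, 0): 1}

a2 = wick_star(a, a)
a_star2 = wick_star(a_star, a_star)
# a^2 * (a*)^2 = (a*)^2 a^2 + 4 hbar a* a + 2 hbar^2
assert wick_star(a2, a_star2) == {(2, 2, 0): 1, (1, 1, 1): 4, (0, 0, 2): 2}
```

The coefficients 4 and 2 in the last line count the ways of contracting one pair, respectively both pairs, of an $a$ on the left with an $a^\ast$ on the right, as in Wick's theorem.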

\,

Now to generalize this to the infinite dimensional case of free field theory:

Definition

(microcausal functionals)

Let $(X,g)$ be a globally hyperbolic spacetime.

Write $\mathcal{F}_{mc} \subset \mathcal{F}_{dist}$ for the subalgebra of smooth functionals

$$ C^\infty(X) \longrightarrow \mathbb{R} $$

on the smooth space of smooth functions on $X$ which is generated from those distributions on some Cartesian product $X^n$ whose wave front set excludes those covectors to a point in $X^n$ all of whose components are in the future cone or all of whose components are in the past cone.

(After deformation quantization below, the distributions appearing in def. 126 are the origin of “operator-valued distributions” in perturbative quantum field theory).

Example

(regular functionals are microcausal)

Every regular functional is a microcausal functional (def. 126), since the wave front set of a distribution that is given by an ordinary function is empty:

$$ \mathcal{F}_{reg} \subset \mathcal{F}_{mc} \,. $$
Example

(adiabatically switched point interactions are microcausal)

Let $g \in C^\infty_c(X)$ be a bump function, then for $n \in \mathbb{N}$ the smooth functional

$$ \begin{array}{ccc} C^\infty(X) &\longrightarrow& \mathbb{R} \\ \phi &\mapsto& \int_X g(x) (\phi(x))^n dvol(x) \end{array} $$

is a microcausal functional (def. 126).

If here we think of $\phi(x)^n$ as a point-interaction term (as for instance in phi^4 theory) then $g$ is to be thought of as an “adiabatically switched” coupling constant. These are the relevant interaction terms to be quantized via causal perturbation theory.

Proof

For notational convenience, consider the case $n = 2$; the other cases are directly analogous. The distribution in question is the delta distribution weighted by $g$:

$$ \int_X g(x) \phi(x)^2 dvol(x) \;=\; \int_{X \times X} g(x_1) \delta(x_1, x_2) \phi(x_1) \phi(x_2) dvol(x_1) dvol(x_2) \;=\; \langle g \cdot \delta(-,-) , (\phi \circ pr_1)\cdot (\phi \circ pr_2) \rangle_g \,. $$

Now for $(x_1, x_2) \in X \times X$ and $\mathbb{R}^{2n} \simeq U \subset X \times X$ a chart around this point, the Fourier transform of $g \cdot \delta(-,-)$ restricted to this chart is proportional to the Fourier transform $\hat g$ of $g$ evaluated at the sum of the two covectors:

$$ \begin{aligned} (k_1, k_2) & \mapsto \int_{\mathbb{R}^{2n}} g(x_1) \delta(x_1, x_2) \exp\left( i ( k_1 \cdot x_1 + k_2 \cdot x_2 ) \right) dvol(x_1) dvol(x_2) \\ & \propto \hat g(k_1 + k_2) \,. \end{aligned} $$

Since $g$ is a plain bump function, its Fourier transform $\hat g$ is quickly decaying (in the sense of wave front sets) with $k_1 + k_2$ (this prop.). Only along the directions with $k_1 + k_2 = 0$ is that function constant, and in particular not decaying.

This means that the wave front set consists of the elements of the form $(x, (k, -k))$ with $k \neq 0$. Since $k$ and $-k$ are both in the future cone or both in the past cone precisely if $k = 0$, this situation is excluded in the wave front set and hence the distribution $g \cdot \delta(-,-)$ is microcausal.

(graphics grabbed from Khavkine-Moretti 14, p. 45)

This shows that microcausality in this case is related to conservation of momentum in the point interaction.
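The decay behaviour used in this proof can be observed numerically. The following is a rough sketch under simplifying assumptions that are not in the text: spacetime is replaced by the real line, $g$ is the standard bump function $\exp(-1/(1 - x^2))$ supported on $[-1,1]$, and the integral is approximated by a Riemann sum. The function names `bump` and `fourier_of_g_delta` are hypothetical.

```python
import cmath
import math

def bump(x):
    """Standard smooth bump function supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def fourier_of_g_delta(k1, k2, steps=4000):
    """Riemann sum for the Fourier transform of g(x1) * delta(x1 - x2).

    Integrating out the delta distribution leaves
        integral of g(x) * exp(i * (k1 + k2) * x) dx,
    which depends only on the sum k1 + k2.
    """
    h = 2.0 / steps
    total = 0j
    for s in range(steps + 1):
        x = -1.0 + s * h
        total += bump(x) * cmath.exp(1j * (k1 + k2) * x)
    return total * h

# Along the directions (k, -k) the transform does not decay at all:
along = [abs(fourier_of_g_delta(k, -k)) for k in (1.0, 10.0, 100.0)]

# Along (k, k) it decays quickly as k grows:
across = [abs(fourier_of_g_delta(k, k)) for k in (1.0, 5.0, 10.0)]

print(along)
print(across)
```

The first list is constant, while the entries of the second list shrink rapidly. So the only non-decaying directions are those with $k_1 + k_2 = 0$, in line with the wave front set found above.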

More generally:

Proposition

(Hadamard-Moyal star product on microcausal functionals)

Let $(X,g)$ be a globally hyperbolic spacetime, and let $\omega \in \mathcal{D}'(X \times X)$ be a Hadamard distribution (def. \ref{HadamardDistribution}) which is guaranteed to exist by prop. \ref{ExistenceOfHadamardDistributions}.

Then the star product

$$ P_1 \star_\omega P_2 \;\coloneqq\; prod \circ \exp\left( \int_{X^2} \hbar \omega(x_1, x_2) \frac{\delta}{\delta \phi(x_1)} \otimes \frac{\delta}{\delta \phi(x_2)} dvol_g \right) (P_1 \otimes P_2) $$

on microcausal functionals $P_1, P_2 \in \mathcal{F}_{mc}$ is well defined in that the products of distributions that appear in expanding out the exponential are such that the sum of the wave front sets of the factors does not intersect the zero section.

Proof

By definition of Hadamard distribution, the wave front set of powers of $\omega$ has all covectors on the first variables future pointing, and all those on the second variables past pointing. The first variables are integrated against those of $P_1$ and the second against those of $P_2$. By definition of microcausal functionals, the wave front sets of $P_1$ and $P_2$ are disjoint from the subsets where all components are future pointing or all components are past pointing. Therefore the relevant sum of the wave front covectors never vanishes.

See Collini 16, p. 25-26

Definition

(Wick algebra of free quantum field)

Let $(X,g)$ be a globally hyperbolic spacetime and let $\omega \in \mathcal{D}'(X \times X)$ be a Hadamard distribution (def. \ref{HadamardDistribution}) which is guaranteed to exist by prop. \ref{ExistenceOfHadamardDistributions}.

Then the Wick algebra of quantum observables of the free scalar field on $(X,g)$ is the space of microcausal functionals $\mathcal{F}_{mc}$ (def. 126) equipped with the Hadamard-Moyal star product from prop. 87:

$$ \mathcal{W}(X,\omega) \;\coloneqq\; \left( \mathcal{F}_{mc}, \star_\omega \right) \,. $$

(To go on-shell one still needs to quotient out the ideal of elements in the image of $\Box_g + m^2$.)

In Minkowski spacetime the Hadamard state is simply the usual vacuum state $\vert vac \rangle$, hence the Hadamard distribution is, as a generalized function,

$$ \omega(x,y) = \langle vac \vert \Phi(x) \Phi(y) \vert vac \rangle \,. $$

Therefore the abstractly defined Wick algebra as in def. 127 in this case satisfies the relation

$$ \int_{X \times X} f(x,y) \; :\Phi(x): \, :\Phi(y): \; dvol_g(x)\, dvol_g(y) \;=\; \int_{X \times X} f(x,y) \left( :\Phi(x) \Phi(y): + \hbar \langle vac \vert \Phi(x) \Phi(y) \vert vac \rangle \right) \; dvol_g(x)\, dvol_g(y) \,. $$

This is the traditional expression for the normal ordered Wick product on Minkowski spacetime (e.g. here).

\,

Scattering

We consider now the axioms for a perturbative S-matrix of a Lagrangian field theory as used in causal perturbation theory (def. 128 below). Since, by definition, the S-matrix is a formal sum of multi-linear continuous functionals, it is convenient to impose axioms on these directly: this is the axiomatics for time-ordered products in def. 129 below. That these latter axioms already imply the former is the statement of prop. 91 below. Its proof requires a close look at the “reverse-time ordered products” for the inverse S-matrix (def. 131 below) and their induced reverse-causal factorization (prop. 90 below).

The axioms we consider here are just the bare minimum of causal perturbation theory, sufficient to imply that the induced perturbative quantum observables organize into a causally local net of quantum observables (discussed below).

In applications one considers further axioms, in particular compatibility of the S-matrix with spacetime symmetry. This is needed for the proof of the main theorem of perturbative renormalization (see below).

Definition

(perturbative S-matrix)

Let $\mathcal{W}$ be a Wick algebra encoding the quantization of free fields in $E$, with

$$ \mathcal{F}_{loc} \overset{:(-):}{\longrightarrow} \mathcal{W} $$

the quantization map (def. \ref{CompactlySupportedPolynomialLocalDensities}).

Then a Lagrangian S-matrix for fields of type $E$ perturbing the free fields encoded by $\mathcal{W}$ is a functional

$$ S \;\colon\; \mathcal{F}_{loc}\otimes ( \langle g,j\rangle ) \longrightarrow \mathcal{W}[ [ g/\hbar, j/\hbar ] ] $$

(on local observables (def. \ref{CompactlySupportedPolynomialLocalDensities}) times the coupling constant $g$ or source strength $j$, with values in the algebra of formal power series in the formal variables $g/\hbar$ and $j/\hbar$ in the given Wick algebra) such that the following conditions hold for fixed $L_{int}, \{ J_n\}_{n = 1}^N$:

  1. (perturbation)

    There exist distributions (multi-linear continuous functionals) of the form

    $$ T \;\colon\; (\mathcal{F}_{loc} \otimes ( \langle g,j \rangle ) )^{\otimes^k} \longrightarrow \mathcal{W}[ [ g/\hbar, j/\hbar ] ] $$

    for all $k \in \mathbb{N}$, such that:

    1. The unary operation is the quantization map

      $$ T(L + A) = :L: + :A: $$
    2. The S-matrix is the exponential of “time-ordered products” in that for $L, A \in \mathcal{F}_{loc}$

      $$ \begin{aligned} S( g L + j A ) & = T\left( \exp_{\otimes}\left( \tfrac{1}{i \hbar} \left( g L + j A\right) \right) \right) \\ & \coloneqq \underoverset{k = 0}{\infty}{\sum} \tfrac{1}{k!} \frac{1}{(i \hbar)^k} T(\underset{k\, \text{arguments}}{\underbrace{ (g L + j A) \cdots (g L + j A) }}) \end{aligned} $$
  2. (normalization)

    $$ S(0) = 1 $$
  3. (causal additivity)

    For all $J_1, J_2, L \in \mathcal{F}_{loc}$ we have

    $$ \left( supp(J_1) \geq supp(J_2) \right) \;\; \Rightarrow \;\; \left( \underset{L \in \mathcal{F}_{loc}}{\forall} \left( S(L + J_1 + J_2) = S(L + J_1) S(L)^{-1} S(L + J_2) \right) \right) \,. $$

Given such a perturbative S-matrix, we say that the generating function (for quantum observables, see def. 132 below) that it induces is the functional

$$ Z \;\colon\; \mathcal{F}_{loc} \langle g \rangle \times \mathcal{F}_{loc} \langle j \rangle \longrightarrow \mathcal{W}[ [ g/\hbar] ][ [ j/\hbar ] ] \qquad (152) $$

given by

$$ Z_{g_{sw}L_{int}}(j_{sw}A) \;\coloneqq\; S(g_{sw}L_{int})^{-1} S( g_{sw}L_{int} + j_{sw}A ) \,. $$

Def. 128 is due to (Epstein-Glaser 73 (1)) (in view of prop. 92 below), except that these authors remain a little vague on the nature of the domain. The domain $\mathcal{F}_{loc}$ is made explicit (in terms of axioms for the time-ordered products, see def. 129 below) in (Brunetti-Fredenhagen 99, section 3, Dütsch-Fredenhagen 04, appendix E, Hollands-Wald 04, around (20)); for review see (Rejzner 16, around def. 6.7).

Remark

(further axioms)

The list of axioms in def. 128, and similarly the list for the time-ordered products below in def. 129, is just the bare minimum which implies that the corresponding quantum observables organize into a causally local net (discussed below). In applications such as the discussion of renormalization (below) one considers further axioms, such as unitarity and compatibility with spacetime symmetry.

Remark

(invertibility of the perturbative S-matrix)

The multiplicative inverse $S(-)^{-1}$ of the perturbative S-matrix in def. 128 always exists: By the axioms “perturbation” and “normalization” this follows with the usual formula for the multiplicative inverse of formal power series that are non-vanishing in degree 0:

If we write

$$ S(g L + j A) \coloneqq 1 + D(g L + j A) $$

then

$$ \begin{aligned} S(g L + j A)^{-1} &= (1 + D(g L + j A))^{-1} \\ & = \underoverset{r = 0}{\infty}{\sum} (-D(g L + j A))^r \\ & = \underoverset{r = 0}{\infty}{\sum} (1 - S(g L + j A))^r \,, \end{aligned} \qquad (153) $$

where the last sum does exist in $\mathcal{W}[ [ g/\hbar, j/\hbar] ]$ because by the axiom “normalization” $D(g L + j A)$ has vanishing coefficient in zeroth order, so that only a finite sub-sum of the formal infinite sum contributes in each order.
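The geometric-series inversion (153) can be tried out in a minimal sketch. We assume here, purely for illustration, a single formal variable with commuting rational coefficients (the actual coefficients live in the non-commutative Wick algebra) and a hypothetical truncation order `ORDER`; the exponential series serves as toy input.

```python
from fractions import Fraction
from math import factorial

ORDER = 6  # hypothetical truncation order for the formal variable

def multiply(a, b):
    """Cauchy product of two truncated power series given as coefficient lists."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += a_i * b_j
    return out

def inverse(s):
    """Invert s = 1 + D by summing the geometric series of (-D)^r."""
    assert s[0] == 1, "normalization: the zeroth-order coefficient must be 1"
    minus_d = [Fraction(0)] + [-c for c in s[1:]]
    one = [Fraction(1)] + [Fraction(0)] * ORDER
    result, power = list(one), list(one)
    for _ in range(ORDER):
        # (-D)^r starts at order r, so powers beyond ORDER never contribute
        power = multiply(power, minus_d)
        result = [x + y for x, y in zip(result, power)]
    return result

# Toy input: the exponential series, whose inverse has coefficients (-1)^k / k!
s = [Fraction(1, factorial(k)) for k in range(ORDER + 1)]
s_inv = inverse(s)
identity = multiply(s, s_inv)
```

Since $(-D)^r$ has no terms below order $r$, the loop over $r$ may stop at the truncation order, which is the finiteness statement used in the remark.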

Remark

(intuitive interpretation of the perturbative S-matrix as a “path integral”)

In traditional informal discussion of perturbative quantum field theory, the S-matrix from def. 128 is thought of as a “path integral”, written

$$ S\left( \tfrac{g}{\hbar} L_{int} + j A \right) \;\overset{\text{not really!}}{=}\; \underset{\Phi \in \Gamma_\Sigma(E)_{asmpt}}{\int} \exp\left( \int_X \left( \tfrac{g}{i \hbar} L_{int}(\Phi) + j A(\Phi) \right) \right) e^{\tfrac{1}{i \hbar}\int_X L_{free}(\Phi) }D[\Phi] \,, $$

where the integration is thought to be over the space of field histories $\Gamma_\Sigma(E)_{asmpt}$ (“field paths”, example 16) which satisfy given asymptotic conditions at $x^0 \to \pm \infty$; and as these boundary conditions vary the above is regarded as an integral kernel that defines the required operator in $\mathcal{W}$ (e.g. Weinberg 95, around (9.3.10) and (9.4.1)).

Here the local density $g_{sw}L_{int}$ has the interpretation of an interaction Lagrangian density $L_{int}$ adiabatically switched by a spacetime-dependent coupling “constant”, and $j_{sw}$ has the interpretation of a source field strength.

On the other hand, the kinetic or free field Lagrangian $L_{free}$, which in the axiomatic description of def. 128 is implicit in the Wick algebra $\mathcal{W}$, is interpreted as determining the would-be Gaussian measure “$e^{\tfrac{1}{i \hbar}\int_X L_{free}(\Phi) }D[\Phi]$” for the path integral.

Since this measure does not actually exist, in general (or is not known to exist), we may instead think of the axioms for the S-matrix in def. 128 as rigorously defining the path integral, not as an actual integration, but “synthetically” by characterizing the behaviour of the result of the would-be integration.

See also remark 29 below.

Definition 128 suggests focusing on the multilinear operations $T(...)$ which define the perturbative S-matrix order-by-order:

Definition

(time-ordered product)

Let $\mathcal{W}$ be a Wick algebra encoding the quantization of free fields in $E$ (def. \ref{CompactlySupportedPolynomialLocalDensities}).

A time-ordered product is a sequence of distributions (multi-linear continuous functionals) of the form

$$ T_k \;\colon\; \mathcal{F}_{loc}^{\otimes^k} \longrightarrow \mathcal{W}[ [ g/\hbar ] ] $$

for all $k \in \mathbb{N}$, such that:

  1. (perturbation) $T_1(g L + j A) = :g L: + :j A:$

  2. (normalization) $T_0 = 1$

  3. (symmetry) each $T_k$ is symmetric in its arguments

  4. (causal factorization) If $supp(L_1) \cup \cdots \cup supp(L_r) \;\geq\; supp(L_{r+1}) \cup \cdots \cup supp(L_k)$ then

    $$ T((g L_1 + j A_1) \cdots (g L_k + j A_k) ) = T( (g L_1 + j A_1) \cdots (g L_r + j A_r) ) T( (g L_{r+1} + j A_{r+1}) \cdots ( g L_k + j A_{k} ) ) \,. $$
Definition

(notation for time-ordered products as generalized functions)

It will be convenient (as in Epstein-Glaser 73) to think of the time-ordered products, being operator-valued distributions, as generalized functions with dependence on spacetime points:

$$ \begin{aligned} & \int_{\Sigma^{r+s}} T_{L_1, \cdots, L_r, A_1, \cdots, A_s}(x_1, \cdots, x_{r}, y_1, \cdots, y_s) \, g_{sw,1}(x_1) \cdots g_{sw, r}(x_r) \, j_{sw,1}(y_1) \cdots j_{sw,s}(y_s) \, dvol_\Sigma(x_1, \cdots, x_r, y_1, \cdots, y_s) \\ & \coloneqq T( g_{sw,1} L_1 \cdots g_{sw,r} L_r \cdot j_{sw,1} A_1 \cdots j_{sw,s}A_s ) \,. \end{aligned} $$

Moreover, the subscripts on these generalized functions will always be clear from the context, so that in computations we will notationally suppress these.

Finally, due to the “symmetry” axiom in def. 129, a time-ordered product depends only on its set of arguments, not on the order of the arguments. We will write $\mathbf{X} \coloneqq \{x_1, \cdots, x_r\}$ and $\mathbf{Y} \coloneqq \{y_1, \cdots, y_s\}$ for sets of spacetime points, and hence abbreviate the expression for the “value” of the generalized function in the above as $T(\mathbf{X}, \mathbf{Y})$ etc.

In this condensed notation the above reads

$$ \int_{\Sigma^{r+s}} T(\mathbf{X}, \mathbf{Y}) \, g_{sw,1}(x_1) \cdots g_{sw, r}(x_r) \, j_{sw,1}(y_1) \cdots j_{sw,s}(y_s) \, dvol_\Sigma(\mathbf{X},\mathbf{Y}) \,. $$

This condensed notation turns out to greatly simplify computations, as it absorbs all the “relative” combinatorial prefactors:

Example

(product of perturbation series in generalized function notation)

Let

$$ U(g) = \underoverset{n = 0}{\infty}{\sum} \frac{1}{n!} \int U(x_1, \cdots, x_n) \, g(x_1) \cdots g(x_n) \, dvol $$

and

$$ V(g) = \underoverset{n = 0}{\infty}{\sum} \frac{1}{n!} \int V(x_1, \cdots, x_n) \, g(x_1) \cdots g(x_n) \, dvol $$

be power series of distributions in formal power series in $g/\hbar$ as in def. 130. Then the product $W(g) \coloneqq U(g) V(g)$ with expansion

$$ W(g) = \underoverset{n = 0}{\infty}{\sum} \frac{1}{n!} \int W(x_1, \cdots, x_n) \, g(x_1) \cdots g(x_n) \, dvol $$

is given simply by

$$ W(\mathbf{X}) \;=\; \underset{\mathbf{I} \subset \mathbf{X}}{\sum} U(\mathbf{I}) V(\mathbf{X} \setminus \mathbf{I}) \,. $$

(Epstein-Glaser 73 (5))

This is because for fixed cardinality ${\vert \mathbf{I} \vert} = n_1$ this sum over all subsets $\mathbf{I} \subset \mathbf{X}$ overcounts the sum over partitions of the coordinates as $(x_1, \cdots x_{n_1}, x_{n_1 + 1}, \cdots x_n)$ precisely by the binomial coefficient $\frac{n!}{n_1! (n - n_1) !}$. Here the factor of $n!$ cancels against the “global” combinatorial prefactor in the above expansion of $W(g)$, while the remaining factor $\frac{1}{n_1! (n - n_1) !}$ is just the “relative” combinatorial prefactor seen at total order $n$ when expanding the product $U(g)V(g)$.
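The counting argument can be checked in a small sketch. We assume, only for illustration, toy kernels whose value depends just on the number of points (the hypothetical tables `U_VALUES` and `V_VALUES`), so that the subset sum can be compared directly with the order-$n$ coefficient of the product of the two exponential-type series.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

# Hypothetical toy kernels: U(I) and V(I) depend only on the number of points in I.
U_VALUES = [1, 2, 5, 7]
V_VALUES = [1, 3, 4, 11]

def subset_sum(points):
    """W(X) as the sum over all subsets I of X of U(I) V(X minus I)."""
    total = 0
    for size in range(len(points) + 1):
        for subset in combinations(points, size):
            total += U_VALUES[len(subset)] * V_VALUES[len(points) - len(subset)]
    return total

def series_coefficient(n):
    """n! times the order-n coefficient of the product U(g) V(g)."""
    return factorial(n) * sum(
        Fraction(U_VALUES[n1], factorial(n1)) * Fraction(V_VALUES[n - n1], factorial(n - n1))
        for n1 in range(n + 1)
    )

# Pairs (subset-sum value, series coefficient) for 0, 1, 2 and 3 points
checks = [(subset_sum(list(range(n))), series_coefficient(n)) for n in range(4)]
```

In this toy setting both entries of each pair agree, reflecting that the number of subsets of cardinality $n_1$ is exactly the binomial coefficient.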

Remark

(the traditional error that leads to the notorious divergencies)

Naively it might seem that the time-ordered products of def. 129 are given simply by multiplication with step functions, in the notation as generalized functions (def. 130):

$$ T(x_1, x_2) \overset{\text{no!}}{=} \theta(x_1^0 - x_2^0) T(x_1) T(x_2) + \theta(x_2^0 - x_1^0) T(x_2) T(x_1) $$

etc. (for instance Weinberg 95, p. 143, between (3.5.9) and (3.5.10)).

This however is simply a mathematical error, in general: Both $T(-,-)$ as well as $\theta$ are distributions and their product of distributions is in general not defined. The notorious “divergencies which plague quantum field theory” are the signature of this ill defined operation.

On the other hand, when both distributions are restricted to the complement of the diagonal (i.e. restricted away from $x_1 = x_2$) then the above expression happens to be well defined and does solve the axioms for time-ordered products.

Hence what needs to be done to properly define the time-ordered product is to choose an extension of distributions of the above expression from the complement of the diagonal to the diagonal. Any such extension will produce time-ordered products. There are in general several different such extensions. This freedom of choice is the freedom of renormalization; or equivalently, by the main theorem of perturbative renormalization theory, this is the freedom of choosing “counter terms” for the local interaction. This we discuss below in Feynman diagrams and (re-)normalization.

In order to prove that the axioms for time-ordered products do imply those for a perturbative S-matrix (prop. 91 below) we need to consider the corresponding reverse-time ordered products:

Definition

(reverse-time ordered product)

Given a time-ordered product $T = \{T_k\}_{k \in \mathbb{N}}$ (def. 129), its reverse-time ordered product

$$ \overline{T}_k \;\colon\; \mathcal{F}_{loc}^{\otimes^k} \longrightarrow \mathcal{W}[ [ g/\hbar ] ] $$

for $k \in \mathbb{N}$ is defined by

$$ \overline{T}( L_1 \cdots L_n ) \;\coloneqq\; \left\{ \array{ \underoverset{r = 1}{n}{\sum} (-1)^r \underset{\sigma \in Unshuffl(n,r)}{\sum} T( L_{\sigma(1)} \cdots L_{\sigma(k_1)} ) \, T( L_{\sigma(k_1 + 1)} \cdots L_{\sigma(k_2)} ) \cdots T( L_{\sigma(k_{r-1}+1)} \cdots L_{\sigma(k_r)} ) &\vert& n \geq 1 \\ 1 &\vert& n = 0 } \right. \,, $$

where the sum is over all unshuffles $\sigma$ of $(1 \leq \cdots \leq n)$ into $r$ non-empty ordered subsequences. Alternatively, as a generalized function as in def. 130, this reads

$$ \overline{T}( \mathbf{X} ) = \underoverset{r = 1}{{\vert \mathbf{X} \vert}}{\sum} (-1)^r \underset{ \array{ \mathbf{I}_1, \cdots, \mathbf{I}_r \neq \emptyset \\ \underset{j \neq k}{\forall}\left( \mathbf{I}_j \cap \mathbf{I}_k = \emptyset \right) \\ \mathbf{I}_1 \cup \cdots \cup \mathbf{I}_r = \mathbf{X} } }{\sum} T( \mathbf{I}_1 ) \cdots T(\mathbf{I}_r) $$

(e.g. Epstein-Glaser 73 (11))

Proposition

(reverse-time ordered products express inverse S-matrix)

Given a time-ordered product $T(-)$ (def. 129), the corresponding reverse-time ordered product $\overline{T}(-)$ (def. 131) expresses the inverse $S(-)^{-1}$ (according to remark 26) of the corresponding perturbative S-matrix $S(L) \coloneqq \underset{k \in \mathbb{N}}{\sum} \tfrac{1}{k!} T(\underset{k\,\text{args}}{\underbrace{L \cdots L}})$:

$$ S(L)^{-1} = \underset{k \in \mathbb{N}}{\sum} \tfrac{1}{k!} \overline{T}( \underset{k \, \text{args}}{\underbrace{L \cdots L}} ) \,. $$
Proof

By definition we have

$$ \underset{k \in \mathbb{N}}{\sum} \tfrac{1}{k!} \overline{T}( \underset{k \, \text{args}}{\underbrace{L \cdots L}} ) = \underset{ k \in \mathbb{N}}{\sum} \tfrac{1}{k!} \underoverset{r = 1}{k}{\sum} (-1)^r \underset{\sigma \in Unshuffl(k,r)}{\sum} T( L_{\sigma(1)} \cdots L_{\sigma(k_1)} ) T( L_{\sigma(k_1 + 1)} \cdots L_{\sigma(k_2)} ) \cdots T( L_{\sigma(k_{r-1}+1)} \cdots L_{\sigma(k_r)} ) $$

where $\underset{i \in \{1 , \cdots, k\}}{\forall} \; L_i \coloneqq L$.

If instead of unshuffles (i.e. partitions into non-empty subsequences preserving the original order) we took partitions into arbitrarily ordered subsequences, we would be overcounting by the factorial of the length of the subsequences, and hence the above may be equivalently written as:

$$ \cdots = \underset{k \in \mathbb{N}}{\sum} \tfrac{1}{k!} \underoverset{r = 1}{k}{\sum} (-1)^r \underset{ {\sigma \in \Sigma(k)} \atop { { k_1 + \cdots + k_r = k } \atop { \underset{i}{\forall} (k_i \geq 1) } } }{\sum} \tfrac{1}{k_1!} \cdots \tfrac{1}{k_r !} \, T( L_{\sigma(1)} \cdots L_{\sigma(k_1)} ) \, T( L_{\sigma(k_1 + 1)} \cdots L_{\sigma(k_2)} ) \cdots T( L_{\sigma(k_{r-1}+1)} \cdots L_{\sigma(k_r)} ) \,, $$

where $\Sigma(k)$ denotes the symmetric group (the collection of all permutations of $k$ elements).

Moreover, since all the $L_i$ are equal, the sum is in fact independent of $\sigma$; it only depends on the length of the subsequences. Since there are $k!$ permutations of $k$ elements the above reduces to

$$ \begin{aligned} \cdots & = \underset{k \in \mathbb{N}}{\sum} \underoverset{r = 1}{k}{\sum} (-1)^r \underset{ { k_1 + \cdots + k_r = k } \atop { \underset{i}{\forall} (k_i \geq 1) } }{\sum} \tfrac{1}{k_1!} \cdots \tfrac{1}{k_r !} T( \underset{k_1 \, \text{factors}}{\underbrace{ L \cdots L }} ) T( \underset{k_2 \, \text{factors}}{\underbrace{ L \cdots L }} ) \cdots T( \underset{k_r \, \text{factors}}{\underbrace{ L \cdots L }} ) \\ & = \underoverset{r = 0}{\infty}{\sum} \left( - \underoverset{k = 1}{\infty}{\sum} \tfrac{1}{k!} T ( \underset{k\,\text{factors}}{\underbrace{L \cdots L}} ) \right)^r \\ & = S(L)^{-1} \,, \end{aligned} $$

where the $k = 0$ summand is understood to be $\overline{T}_0 = 1$, which is the $r = 0$ term in the second line, and where in the last line we used (153).

In fact prop. 88 is a special case of the following more general statement:

Proposition

(inversion relation for reverse-time ordered products)

Let $\{T_k\}_{k \in \mathbb{N}}$ be time-ordered products according to def. 129. Then the reverse-time ordered products according to def. 131 satisfy the following inversion relation for all $\mathbf{X} \neq \emptyset$ (in the condensed notation of def. 130)

$$ \underset{\mathbf{J} \subset \mathbf{X}}{\sum} T(\mathbf{J}) \overline{T}(\mathbf{X} \setminus \mathbf{J}) \;=\; 0 $$

and

$$ \underset{\mathbf{J} \subset \mathbf{X}}{\sum} \overline{T}(\mathbf{X} \setminus \mathbf{J}) T(\mathbf{J}) \;=\; 0 $$
Proof

This is immediate from unwinding the definitions.
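The unwinding can also be carried out mechanically for small sets of points. The following is a minimal sketch under stated assumptions: the symbols $T(\mathbf{I})$ are modelled as free non-commuting generators labelled by non-empty finite sets (with $T(\emptyset) = 1$), algebra elements are dictionaries from words to integer coefficients, and all function names are hypothetical. No causal structure enters; the relation is purely combinatorial.

```python
from itertools import combinations

def ordered_partitions(points):
    """Yield all tuples (I_1, ..., I_r) of non-empty disjoint sets with union `points`."""
    points = frozenset(points)
    if not points:
        yield ()
        return
    for size in range(1, len(points) + 1):
        for first in combinations(sorted(points), size):
            for rest in ordered_partitions(points - frozenset(first)):
                yield (frozenset(first),) + rest

def t(points):
    """Generator T(I); the empty set gives the unit (the empty word)."""
    points = frozenset(points)
    return {((points,) if points else ()): 1}

def t_bar(points):
    """Reverse-time ordered product as the alternating sum over ordered partitions."""
    return {blocks: (-1) ** len(blocks) for blocks in ordered_partitions(points)}

def multiply(a, b):
    """Product in the free algebra: concatenate words, multiply coefficients."""
    out = {}
    for word_a, coeff_a in a.items():
        for word_b, coeff_b in b.items():
            out[word_a + word_b] = out.get(word_a + word_b, 0) + coeff_a * coeff_b
    return out

def inversion_sum(points, bar_on_left=False):
    """Sum over subsets J of T(J) T-bar(X minus J), or with the factors swapped."""
    points = frozenset(points)
    total = {}
    for size in range(len(points) + 1):
        for subset in combinations(sorted(points), size):
            j = frozenset(subset)
            if bar_on_left:
                term = multiply(t_bar(points - j), t(j))
            else:
                term = multiply(t(j), t_bar(points - j))
            for word, coeff in term.items():
                total[word] = total.get(word, 0) + coeff
    return {word: coeff for word, coeff in total.items() if coeff != 0}
```

For instance `inversion_sum({1, 2, 3})` returns the empty dictionary, i.e. zero, in either order of the factors, while for the empty set the sum is the unit, which is why the relation is stated only for non-empty $\mathbf{X}$.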

Proposition

(reverse causal factorization of reverse-time ordered products)

Let $\{T_k\}_{k \in \mathbb{N}}$ be time-ordered products according to def. 129. Then the reverse-time ordered products according to def. 131 satisfy reverse-causal factorization.

(Epstein-Glaser 73, around (15))

Proof

In the condensed notation of def. 130, we need to show that for $\mathbf{X} = \mathbf{P} \cup \mathbf{Q}$ with $\mathbf{P} \cap \mathbf{Q} = \emptyset$ we have

$$ \left( \mathbf{P} \geq \mathbf{Q} \right) \;\Rightarrow\; \left( \overline{T}(\mathbf{X}) = \overline{T}(\mathbf{Q}) \overline{T}(\mathbf{P}) \right) \,. $$

We proceed by induction. If ${\vert \mathbf{X}\vert} = 1$ the statement is immediate. So assume that the statement is true for sets of cardinality $n \geq 1$ and consider $\mathbf{X}$ with ${\vert \mathbf{X}\vert} = n+1$.

We make free use of the condensed notation as in example 113.

From the formal inversion

$$ \underset{\mathbf{J} \subset \mathbf{X}}{\sum} \overline{T}(\mathbf{J}) T(\mathbf{X}\setminus \mathbf{J}) = 0 $$

(which uses the induction assumption that ${\vert \mathbf{X}\vert} \geq 1$) it follows that

$$ \begin{aligned} \overline{T}(\mathbf{X}) & = - \underset{ { \mathbf{J} \subset \mathbf{X} } \atop { \mathbf{J} \neq \mathbf{X} } }{\sum} \overline{T}(\mathbf{J}) T( \mathbf{X} \setminus \mathbf{J} ) \\ & = - \underset{ { \mathbf{J} \cup \mathbf{J}' = \mathbf{X} } \atop { { \mathbf{J} \cap \mathbf{J}' = \emptyset } \atop { \mathbf{J}' \neq \emptyset } } }{\sum} \overline{T}( \mathbf{Q} \cap \mathbf{J} ) \overline{T}( \mathbf{P} \cap \mathbf{J} ) T ( \mathbf{P} \cap \mathbf{J}' ) T ( \mathbf{Q} \cap \mathbf{J}' ) \\ & = - \underset{ { \mathbf{L} \cup \mathbf{L}' = \mathbf{Q} \,,\, \mathbf{L} \cap \mathbf{L}' = \emptyset } \atop { \mathbf{L}' \neq \emptyset } }{\sum} \overline{T}( \mathbf{L} ) \underset{ = 0}{ \underbrace{ \left( \underset{ \mathbf{K} \subset \mathbf{P} }{\sum} \overline{T}( \mathbf{K} ) T( \mathbf{P} \setminus \mathbf{K}) \right) } } T(\mathbf{L}') - \overline{T}(\mathbf{Q}) \underset{ = - \overline{T}(\mathbf{P}) }{ \underbrace{ \underset{ {\mathbf{K} \subset \mathbf{P}} \atop { \mathbf{K} \neq \mathbf{P} } }{\sum} \overline{T}(\mathbf{K}) T (\mathbf{P} \setminus \mathbf{K} ) }} \\ & = \overline{T}(\mathbf{Q}) \overline{T}(\mathbf{P}) \,. \end{aligned} $$

Here

  1. in the second line we used that $\mathbf{X} = \mathbf{Q} \sqcup \mathbf{P}$, together with the causal factorization property of $T(-)$ (which holds by general assumption) and that of $\overline{T}(-)$ (which holds by the induction assumption, using that $\mathbf{J} \neq \mathbf{X}$ hence that ${\vert \mathbf{J}\vert} \lt {\vert \mathbf{X}\vert}$).

  2. in the third line we decomposed the sum over $\mathbf{J}, \mathbf{J}' \subset \mathbf{X}$ into two sums over subsets of $\mathbf{Q}$ and $\mathbf{P}$:

    1. The first summand in the third line is the contribution where $\mathbf{J}'$ has a non-empty intersection with $\mathbf{Q}$. This makes $\mathbf{K}$ range without constraint, and therefore the sum in the middle vanishes, as indicated, as it is the contribution at order ${\vert \mathbf{P}\vert}$ of the inversion formula from prop. 89.

    2. The second summand in the third line is the contribution where $\mathbf{J}'$ does not intersect $\mathbf{Q}$. Then $\mathbf{J} = \mathbf{Q} \cup \mathbf{K}$ with $\mathbf{K} \neq \mathbf{P}$, so the sum over $\mathbf{K}$ is the inversion formula from prop. 89 except for the one term with $\mathbf{K} = \mathbf{P}$, and hence it equals minus that term.
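Reverse-causal factorization can be illustrated in a deliberately naive toy model, which is an assumption made here for illustration and not the construction of the text: spacetime points are replaced by distinct real "times", and $T(\mathbf{X})$ is taken to be the word of free non-commuting generators sorted with later times to the left. This toy $T$ satisfies causal factorization, and the sketch below evaluates the defining alternating sum for $\overline{T}$; all names are hypothetical.

```python
from itertools import combinations

def ordered_partitions(points):
    """Yield all tuples (I_1, ..., I_r) of non-empty disjoint sets with union `points`."""
    points = frozenset(points)
    if not points:
        yield ()
        return
    for size in range(1, len(points) + 1):
        for first in combinations(sorted(points), size):
            for rest in ordered_partitions(points - frozenset(first)):
                yield (frozenset(first),) + rest

def time_ordered(points):
    """Toy T(X): the word of generators with later times to the left."""
    return tuple(sorted(points, reverse=True))

def reverse_time_ordered(points):
    """Toy T-bar(X) as a dict word -> coefficient, from the defining alternating sum."""
    result = {}
    for blocks in ordered_partitions(points):
        # concatenate the time-ordered words of the blocks, in the order of the blocks
        word = tuple(x for block in blocks for x in time_ordered(block))
        result[word] = result.get(word, 0) + (-1) ** len(blocks)
    return {word: coeff for word, coeff in result.items() if coeff != 0}
```

In this toy model all words cancel except the one with earlier times to the left, with sign $(-1)^{\vert \mathbf{X} \vert}$; for example `reverse_time_ordered({1, 2, 3})` is `{(1, 2, 3): -1}`. This is what iterating $\overline{T}(\mathbf{X}) = \overline{T}(\mathbf{Q}) \overline{T}(\mathbf{P})$ down to single points predicts.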

Using these facts about the reverse-time ordered products, we may finally prove that time-ordered products do indeed induce a perturbative S-matrix:

Proposition

(time-ordered products induce perturbative S-matrix)

Let $\{T_k\}_{k \in \mathbb{N}}$ be a system of time-ordered products according to def. 129. Then

$$ \begin{aligned} S(-) & \coloneqq T \exp\left( \tfrac{1}{i \hbar}(-) \right) \\ & \coloneqq \underset{k \in \mathbb{N}}{\sum} \tfrac{1}{(i \hbar)^k} \tfrac{1}{k!} T( \underset{k \, \text{factors}}{\underbrace{- \cdots -}} ) \end{aligned} $$

is indeed a perturbative S-matrix according to def. 128.

Proof

The axioms “perturbation” and “normalization” for the S-matrix are immediate from the corresponding axioms of the time-ordered products. What requires proof is that causal additivity of the S-matrix follows from the causal factorization property of the time-ordered products.

Notice that also the simple causal factorization property of the S-matrix

$$ \left( supp(g_{sw,1}L_1) \geq supp(g_{sw,2}L_2) \right) \;\Rightarrow\; \left( S(g_{sw,1}L_1 + g_{sw,2}L_2) = S(g_{sw,1}L_1) S(g_{sw,2}L_2) \right) $$

is immediate from the time-ordering axiom of the time-ordered products.

But causal additivity is stronger. It is remarkable that this, too, follows from just the time-ordering (Epstein-Glaser 73, around (73)):

To see this, first expand the generating functional $Z$ (152) into powers of $(g/\hbar)$ and $(j/\hbar)$

$$ Z_{L}(A) \;=\; \underoverset{n,m = 0}{\infty}{\sum} \frac{1}{n! m!} R( \underset{n\, \text{factors}}{\underbrace{L \cdots L}}, \underset{m \, \text{factors}}{ \underbrace{ A \cdots A } } ) $$

and then compare order-by-order with the given time-ordered product $T$ and its induced reverse-time ordered product (def. 131) via prop. 88. (These $R(-,-)$ are also called the “generating retarded products”, discussed in their own right around def. 133 below.)

In the condensed notation of def. 130 and its way of absorbing combinatorial prefactors as in example 113 this yields at order $(g/\hbar)^{\vert \mathbf{Y}\vert} (j/\hbar)^{\vert \mathbf{X}\vert}$ the coefficient

$$ R(\mathbf{Y}, \mathbf{X}) \;=\; \underset{\mathbf{I} \subset \mathbf{Y}}{\sum} \overline{T}(\mathbf{I}) T( (\mathbf{Y} \setminus \mathbf{I}) , \mathbf{X} ) \,. \qquad (154) $$

We claim now that the support of $R$ is inside the subset for which $\mathbf{Y}$ is in the causal past of $\mathbf{X}$. This will imply the claim, because by multi-linearity of $R(-,-)$ it then follows that

$$ \left(J_1 \geq J_2\right) \Rightarrow \left( Z_{L + J_1}(J_2) = Z_L(J_2) \right) $$

and by prop. 92 this is equivalent to causal additivity of the S-matrix.

It remains to prove the claim:

Consider $\mathbf{X}, \mathbf{Y} \subset \Sigma$ such that the subset $\mathbf{P} \subset \mathbf{Y}$ of points not in the past of $\mathbf{X}$ (def. 32), hence the maximal subset with

$$ \mathbf{P} \geq \mathbf{X} \,, $$

is non-empty. We need to show that in this case $R(\mathbf{Y}, \mathbf{X}) = 0$ (in the sense of generalized functions).

Write $\mathbf{Q} \coloneqq \mathbf{Y} \setminus \mathbf{P}$ for the complementary set of points, so that all points of $\mathbf{Q}$ are in the past of $\mathbf{X}$. Notice that this implies that $\mathbf{P}$ is also not in the past of $\mathbf{Q}$:

$$ \mathbf{P} \geq \mathbf{Q} \,. $$

With this decomposition of $\mathbf{Y}$, the sum in (154) over subsets $\mathbf{I}$ of $\mathbf{Y}$ may be decomposed into a sum over subsets $\mathbf{J}$ of $\mathbf{P}$ and $\mathbf{K}$ of $\mathbf{Q}$, respectively. These subsets inherit the above causal ordering, so that by the causal factorization property of $T(-)$ (def. 129) and $\overline{T}(-)$ (prop. 90) the time-ordered and reverse time-ordered products factor on these arguments:

$$ \begin{aligned} R(\mathbf{Y}, \mathbf{X}) & = \underset{ {\mathbf{J} \subset \mathbf{P}} \atop { \mathbf{K} \subset \mathbf{Q} } }{\sum} \, \overline{T}( \mathbf{J} \cup \mathbf{K} ) T( (\mathbf{P} \setminus \mathbf{J}) \cup (\mathbf{Q} \setminus \mathbf{K}), \mathbf{X} ) \\ & = \underset{ {\mathbf{J} \subset \mathbf{P}} \atop { \mathbf{K} \subset \mathbf{Q} } }{\sum} \, \overline{T}( \mathbf{K} ) \overline{T}( \mathbf{J} ) T( \mathbf{P} \setminus \mathbf{J} ) T( \mathbf{Q} \setminus \mathbf{K}, \mathbf{X} ) \\ & = \underset{ \mathbf{K} \subset \mathbf{Q} }{\sum} \overline{T}(\mathbf{K}) \underset{= 0}{ \underbrace{ \left( \underset{\mathbf{J} \subset \mathbf{P}}{\sum} \overline{T}(\mathbf{J}) T( \mathbf{P} \setminus \mathbf{J} ) \right) }} T(\mathbf{Q} \setminus \mathbf{K}, \mathbf{X}) \,. \end{aligned} $$

Here the sub-sum in brackets vanishes by the inversion formula, prop. 89.
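The support property of $R$ can be seen in a naive toy model, assumed here only for illustration: points are distinct real "times", $T$ of a set of points is the word of free generators with later times to the left, and $\overline{T}$ is taken, as reverse-causal factorization (prop. 90) dictates in such a model, to be the sign $(-1)^{\vert \mathbf{I} \vert}$ times the word with earlier times to the left. The function names are hypothetical.

```python
from itertools import combinations

def time_ordered(points):
    """Toy T: the word of generators with later times to the left."""
    return tuple(sorted(points, reverse=True))

def reverse_time_ordered(points):
    """Toy T-bar: sign (-1)^|I| and the word with earlier times to the left."""
    return (-1) ** len(points), tuple(sorted(points))

def retarded(ys, xs):
    """R(Y, X) as the sum over subsets I of Y of T-bar(I) T((Y minus I) union X)."""
    ys, xs = set(ys), set(xs)
    total = {}
    for size in range(len(ys) + 1):
        for subset in combinations(sorted(ys), size):
            sign, bar_word = reverse_time_ordered(subset)
            word = bar_word + time_ordered((ys - set(subset)) | xs)
            total[word] = total.get(word, 0) + sign
    return {word: coeff for word, coeff in total.items() if coeff != 0}
```

Here `retarded({3}, {2})` and `retarded({1, 3}, {2})` vanish, since the time 3 is not in the past of 2, while `retarded({1}, {2})` is the commutator-like expression `{(2, 1): 1, (1, 2): -1}`.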

\,

Quantum observables

A genuine local observable should depend on the values of the fields on some compact subset of spacetime. Moreover, a perturbative quantum observable should be a power series in Planck's constant $\hbar$, reducing to the corresponding classical observable at $\hbar = 0$. The perturbative S-matrix axiomatized above is neither localized in spacetime this way, nor is it a power series in $\hbar$ (it is a Laurent series in $\hbar$). So it is not a local observable. But the actual quantum observables on interacting fields may be expressed in terms of the S-matrix by Bogoliubov's formula (def. 132 below).

This formula is consistent in that it implies that local observables form a causally local net as their spacetime support varies (this is prop. 94 below). (On deeper grounds, this formula turns out to yield the formal Fedosov deformation quantization of the interacting field theory (Collini 16).)

Namely a key consequence of the “causal additivity” axiom on the S-matrix in def. 128 turns out to be that the perturbative quantum observables on interacting fields with compact spacetime support (def. 132)

  1. depend on the adiabatic switching $g_{sw}$ of the interaction Lagrangian density only up to canonical unitary isomorphism (prop. 92 below)

  2. form a causally local net of observables in the sense of the Haag-Kastler axioms as the spacetime localization varies (prop. 94 below).

To the extent that a local net of observables may be regarded as defining a quantum field theory, which is the claim of (perturbative) AQFT, this proves that the perturbative S-matrices in causal perturbation theory as in def. 128 indeed make sense, despite the involvement of adiabatic switching of the interaction Lagrangian density, which does not make physical sense when interpreted naively: In reality the interaction is of course not (for realistic theories at least) “switched off” outside some bounded region of spacetime; but the result here shows that if we pretend that it is, then first of all we get consistent mathematical formulas, and moreover we can nevertheless compute the correct quantum observables that are localized in this spacetime region. But the local net of observables as the spacetime localization varies is supposed to encode the full quantum field theory. Certainly any given experiment in practice probes a bounded spacetime region, and hence the algebra of observables localized in this region is sufficient to compare the theory to experiment.

\,

Definition

(perturbative quantum observables on interacting fields via Bogoliubov's formula)

Let $S$ be a perturbative S-matrix as in def. 128, and $g_{sw} L_{int} \in \mathcal{F}_{loc}\langle g\rangle$ an adiabatically switched interaction Lagrangian density.

Then for $A \in \mathcal{F}_{loc}$ a local observable, the perturbative quantum observable $\widehat{A}$ corresponding to $A$ is the operator-valued distribution

$$ \widehat{A} \;\colon\; C^\infty_{cp}(\Sigma) \longrightarrow \mathcal{W}[ [ g ] ][ [ \hbar ] ] $$

which is the derivative of the generating functional $Z$ ((152) in def. 128) at vanishing source field:

$$ \widehat{A}(j) \;\coloneqq\; - i \hbar \frac{d}{d \epsilon} Z_{g_{sw} L_{int}}( \epsilon j A)\vert_{\epsilon = 0} \,. $$

This definition of $\widehat{A}$ without the adiabatic switching $g_{sw}$ is originally due to (Bogoliubov-Shirkov 59), and is nowadays sometimes called Bogoliubov's formula (e.g. Rejzner 16 (6.12)). The version with adiabatic switching is due to (Epstein-Glaser 73 around (74)). Review includes (Dütsch-Fredenhagen 00, around (17)).

Remark

(intuitive interpretation of Bogoliubov's formula in terms of a “path integral”)

With the perturbative S-matrix intuitively thought of as a “path integral” as in remark 27

$$ S(\tfrac{g}{\hbar} L_{int} + j A) \;\overset{\text{not really!}}{=}\; \int \exp\left( \int_X \left( \tfrac{g}{i \hbar} L_{int}(\Phi) + j A(\Phi) \right) \right) e^{\tfrac{1}{i \hbar}\int_X L_{free}(\Phi) }D[\Phi] $$

the Bogoliubov formula in def. 132 similarly would have the following heuristic interpretation:

$$ \widehat A(j) \;\overset{\text{not really!}}{=}\; \frac{ \int j A(\Phi) \exp\left( \int_X \left( \tfrac{g}{i \hbar} L_{int}(\Phi) \right) \right) e^{\tfrac{1}{i \hbar}\int_X L_{free}(\Phi) }D[\Phi] } { \int \exp\left( \int_X \left( \tfrac{g}{i \hbar} L_{int}(\Phi) \right) \right) e^{\tfrac{1}{i \hbar}\int_X L_{free}(\Phi) }D[\Phi] } $$

If here we were to regard the expression

$$ \mu(\Phi) \;\overset{\text{not really!}}{\coloneqq}\; \frac{ \exp\left( \int_X \left( \tfrac{g}{i \hbar} L_{int}(\Phi) \right) \right) e^{\tfrac{1}{i \hbar}\int_X L_{free}(\Phi) }D[\Phi] } { \int \exp\left( \int_X \left( \tfrac{g}{i \hbar} L_{int}(\phi) \right) \right) e^{\tfrac{1}{i \hbar}\int_X L_{free}(\phi) }D[\phi] } $$

as a “complex probability measure” on the space of field histories (“field paths”), then this formula would express the expectation value of the functional $A$ under this measure:

$$ \widehat{A}(j) \overset{\text{not really!}}{=} [j A]_{\mu} = \int j A(\Phi) \mu(\Phi) \,. $$

The power series coefficients of the quantum observables on interacting fields are also called the retarded products. For the time being we mention these here just for completeness:

Definition

(retarded products induced from perturbative S-matrix)

It follows from the perturbation axiom in def. 128 that there is a system of continuous linear functionals

$$ R \;\colon\; \left(\mathcal{F}_{loc}\langle g\rangle\right)^{\otimes^k} \otimes (\mathcal{F}_{loc})^{\otimes^l} \longrightarrow \mathcal{W}[ [ g/\hbar] ] [ [ j/\hbar] ] $$

for all $k,l \in \mathbb{N}$ such that

$$ Z_{g_{sw} L}(j_{sw} A) = \underset{k,l \in \mathbb{N}}{\sum} \frac{1}{k! l!} R( \underset{k \,\text{arguments}}{\underbrace{ g_{sw} L \cdots g_{sw} L } }, \underset{l \; \text{arguments}}{\underbrace{ j_{sw} A \cdots j_{sw} A }} ) \,. $$

Similarly there is

$$ r \;\colon\; \left(\mathcal{F}_{loc}\langle g \rangle\right)^{\otimes^k} \otimes \left(\mathcal{F}_{loc}\langle j \rangle\right) \longrightarrow \mathcal{W}[ [ g ] ] [ [ \hbar ] ] $$

such that

$$ \widehat{A}(h) = \underset{k \in \mathbb{N}}{\sum} \frac{1}{k!} r( \underset{k \,\text{arguments}}{\underbrace{g_{sw}L_{int} \cdots g_{sw}L_{int}}}, h A ) \,. $$

These are called the (generating) retarded products (Glaser-Lehmann-Zimmermann 57, Epstein-Glaser 73, section 8.1).

Direct axiomatization of the retarded products is due to (Dütsch-Fredenhagen 04), see (Collini 16, section 2.2).

\,

It is useful now to reformulate the causal additivity-property of the perturbative S-matrix in terms of the generating functions / retarded products:

Proposition

(causal locality of the perturbative S-matrix)

Let $S$ be a perturbative S-matrix according to def. 128, with $Z$ the generating functional (152) it induces.

  1. The following conditions are equivalent for all $L, J_1, J_2 \in \mathcal{F}_{loc}$:

    1. $Z_L(J_1 + J_2) = Z_L(J_1) Z_L(J_2)$

    2. $Z_{L + J_1}(J_2) = Z_L(J_2)$

    3. $S(L + J_1 + J_2) = S(L + J_1) \, S(L)^{-1} \, S(L + J_2)$

    Hence causal additivity in def. 128 implies that all these conditions hold if $supp(J_1) \geq supp(J_2)$.

  2. If $supp(J_1)$ is spacelike separated from $supp(J_2)$, hence if the causal ordering (def. 32) is $supp(J_1) \geq supp(J_2)$ and $supp(J_2) \geq supp(J_1)$, then

    $$ Z_{L_{int}}(J_1) Z_{L_{int}}(J_2) = Z_{L_{int}}(J_2) Z_{L_{int}}(J_1) \,. $$

    Similarly, if $supp(L_1) \geq supp(L_2)$ and $supp(L_2) \geq supp(L_1)$ then

    $$ S(L_1) \, S(L_2) = S(L_2) \, S(L_1) \,. $$

  3. If $L_1\vert_{O} = L_2\vert_{O}$ on a causally closed subset $O \subset \mathbb{R}^{d-1,1}$ then there exists an invertible $K \in \mathcal{W}[ [ g/\hbar] ]$ such that for all $J$ with $supp(J) \subset O$ it relates $Z_{L_1}(J)$ to $Z_{L_2}(J)$ by conjugation:

    $$ Z_{L_2}(J) = K^{-1} \, Z_{L_1}(J) \, K \,. $$

Proof

The equivalence of the three conditions in the first statement is immediate from the definitions:

Expanding out the definition of $Z$, the first clause is equivalent to

$$ S(L)^{-1} S(L + J_1 + J_2) = S(L)^{-1} S(L + J_1 ) S(L)^{-1} S(L + J_2) \,. $$

Multiplying both sides of this equation from the left by $S(L)$ shows that it is equivalent to the third clause.

Multiplying once more from the left by $S(L + J_1)^{-1}$, this third equation is seen to be equivalent to

$$ S(L + J_1)^{-1} S(L + J_1 + J_2) = S(L)^{-1} S(L + J_2) \,, $$

which is equivalently the second clause, by definition of $Z$.

Now the first clause of the first item immediately implies the first clause of the second item.

Similarly, setting $L = 0$ and $J_1 = L_1$ and $J_2 = L_2$ in the third clause of the first item, it reduces to

$$ \left( supp(L_1) \geq supp(L_2) \right) \;\Rightarrow\; S(L_1 + L_2) = S(L_1)S(L_2) \,. $$

Hence if $supp(L_1) \geq supp(L_2)$ and $supp(L_2) \geq supp(L_1)$ then

$$ S(L_1) S(L_2) = S(L_1 + L_2) = S(L_2 + L_1) = S(L_2) S(L_1) \,, $$

which is the second clause of the second statement to be shown.

For the last statement, notice that by causal closure of $O$ the difference $L_2 - L_1$, which by assumption has $supp(L_2 - L_1) \subset X \setminus O$, may, according to lemma 1, be written as

$$ L_2 - L_1 = a + r $$

such that the causal order (def. 32) of the supports is

$$ supp(a) \geq supp(J) \geq supp(r) \,. $$

It follows with causal additivity and its equivalent formulations above that

$$
\begin{aligned}
  Z_{L_2}(J) & = Z_{L_1 + a + r}(J)
  \\
  & = Z_{L_1 + r}(J)
  \\
  & = S(L_1 + r)^{-1} \, S(L_1 + r + J)
  \\
  & = S(L_1 + r)^{-1} \, S(J + L_1) \, S(L_1)^{-1} \, S(L_1 + r)
  \\
  & = S(L_1 + r)^{-1} \underset{= id}{\underbrace{S(L_1) S(L_1)^{-1}}} S(L_1 + J) \, S(L_1)^{-1} \, S(L_1 + r)
  \\
  & = Z_{L_1}(r)^{-1} \, Z_{L_1}(J) \, Z_{L_1}(r)
\end{aligned}
$$

and hence the last statement holds for $K \coloneqq Z_{L_1}(r)$.
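The purely algebraic part of this proof (the equivalence of the three clauses in the first item) can be checked in a toy model. The following is a minimal sketch, not part of the construction above: it assumes that the values of the S-matrix are replaced by hypothetical invertible 2×2 rational matrices, imposes the third clause as the definition of $S(L + J_1 + J_2)$, and verifies that the other two clauses then hold for $Z_L(J) = S(L)^{-1} S(L + J)$.

```python
from fractions import Fraction

def mat(rows):
    """Build a 2x2 matrix with exact rational entries."""
    return [[Fraction(x) for x in row] for row in rows]

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

# Hypothetical stand-ins for S(L), S(L + J_1), S(L + J_2) (arbitrary invertible matrices).
S_L = mat([[1, 2], [0, 1]])
S_L_J1 = mat([[2, 1], [1, 1]])
S_L_J2 = mat([[1, 0], [3, 1]])

# Third clause, imposed as a definition: S(L+J_1+J_2) = S(L+J_1) S(L)^{-1} S(L+J_2).
S_L_J1_J2 = mul(mul(S_L_J1, inv(S_L)), S_L_J2)

def Z(S_base, S_shifted):
    """Toy generating functional Z_L(J) = S(L)^{-1} S(L + J)."""
    return mul(inv(S_base), S_shifted)

# First clause: Z_L(J_1 + J_2) = Z_L(J_1) Z_L(J_2).
clause_1 = Z(S_L, S_L_J1_J2) == mul(Z(S_L, S_L_J1), Z(S_L, S_L_J2))
# Second clause: Z_{L + J_1}(J_2) = Z_L(J_2).
clause_2 = Z(S_L_J1, S_L_J1_J2) == Z(S_L, S_L_J2)

print(clause_1, clause_2)
```

Exact rational arithmetic is used so that the matrix identities can be compared with `==` without rounding issues. The check uses only associativity and invertibility, which is all that the first part of the proof uses.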

We now use this fact (prop. 92) to neatly organize the system of localized quantum observables on interacting fields:

Definition

(system of perturbative generating algebras of observables)

Let $S$ be a perturbative S-matrix according to def. 128 and let $L_{int} \in \Omega^{d,0}(E)$ be an interaction Lagrangian density.

For $\mathcal{O} \subset \Sigma$ a causally closed subset of spacetime (def. 8) and for $g_{sw} \in Cutoffs(\mathcal{O})$ an adiabatic switching function (def. 33) which is constant on a neighbourhood of $\mathcal{O}$, write

$$ Gen_{g_{sw} L_{int}}(\mathcal{O}) \coloneqq \left\langle Z_{g_{sw}L_{int}}(J) \;\vert\; supp(J) \subset \mathcal{O} \right\rangle \;\subset\; \mathcal{W}[ [ g/\hbar] ] $$

for the smallest subalgebra of the Wick algebra which contains the generating functions for correlation functions (def. 132) of the form $Z_{g_{sw}L_{int}}(J)$, for all those local observables $J \in \mathcal{F}_{loc}$ with $supp(J) \subset \mathcal{O}$.

Moreover, write

$$ Gen_{L_{int}}(\mathcal{O}) \;\subset\; \underset{g_{sw} \in Cutoffs(\mathcal{O})}{\prod} Gen_{g_{sw}L_{int}}(\mathcal{O}) $$

for the subalgebra of the Cartesian product of all these algebras as $g_{sw}$ ranges, which is generated by the tuples

$$ Z_{L_{int}}(J) \;\coloneqq\; \left( Z_{g_{sw}L_{int}} (J) \right)_{g_{sw} \in Cutoffs(\mathcal{O})} $$

for $J$ with $supp(J) \subset \mathcal{O}$.

Finally, for $\mathcal{O}_1 \subset \mathcal{O}_2$ an inclusion of two causally closed subsets, let

$$ i_{\mathcal{O}_1, \mathcal{O}_2} \;\colon\; Gen_{L_{int}}(\mathcal{O}_1) \longrightarrow Gen_{L_{int}}(\mathcal{O}_2) $$

be the algebra homomorphism which is given simply by restricting the index set of tuples.

This construction defines a functor

$$ Gen_{L_{int}} \;\colon\; CausClsdSubsets(\Sigma) \longrightarrow StarAlgebras $$

from the poset of causally closed subsets of spacetime to the category of star algebras.

(Brunetti-Fredenhagen 99, (65)-(67))

Remark

(algebra of observables well defined up to canonical isomorphism)

By prop. 92, for every causally closed $\mathcal{O} \subset X$ and every $g_{sw} \in Cutoffs(\mathcal{O})$ the abstract algebra $Gen_{L_{int}}(\mathcal{O})$ from def. 134 is canonically isomorphic to the subalgebra $Gen_{g_{sw}L_{int}}(\mathcal{O}) \subset \mathcal{W}[ [ g/\hbar ] ]$ of formal power series in the Wick algebra.

Beware the slight subtlety in this statement:

The unitary elements $K$ in $\mathcal{W}[ [ g/\hbar] ]$ which exhibit the isomorphisms by conjugation are not unique, since there are many choices of splittings $g_2 - g_1 = a + r$ in the proof of prop. 92. But the induced isomorphism between the algebras generated by the $Z_{L_{int}}(J)$ is independent of this ambiguity, since, again by the proof of prop. 92, conjugation by each such $K$ gives the same result on the given generators: $Z_{g_1 L_{int}}(J) \mapsto Z_{g_2 L_{int}}(J)$.

Proposition

(system of perturbative generating algebras is causally local net of observables)

Given a perturbative S-matrix according to def. 128 and an interaction Lagrangian density $L_{int}$, the system of generating algebras of observables $Gen_{L_{int}}$ (def. 134) is a causally local net of observables in that

  1. (isotony) For every inclusion $\mathcal{O}_1 \subset \mathcal{O}_2$ of causally closed subsets the corresponding algebra homomorphism is a monomorphism

    $$ i_{\mathcal{O}_1, \mathcal{O}_2} \;\colon\; Gen_{L_{int}}(\mathcal{O}_1) \hookrightarrow Gen_{L_{int}}(\mathcal{O}_2) $$

  2. (causal locality) For $\mathcal{O}_1, \mathcal{O}_2 \subset X$ two causally closed subsets which are spacelike separated, in that their causal ordering (def. 32) satisfies

    $$ \mathcal{O}_1 \geq \mathcal{O}_2 \;\text{and}\; \mathcal{O}_2 \geq \mathcal{O}_1 \,, $$

    and for $\mathcal{O} \subset X$ any further causally closed subset which contains both

    $$ \mathcal{O}_1 , \mathcal{O}_2 \subset \mathcal{O} \,, $$

    the corresponding images of the generating algebras of $\mathcal{O}_1$ and $\mathcal{O}_2$, respectively, commute with each other as subalgebras of the generating algebra of $\mathcal{O}$:

    $$ \left[ i_{\mathcal{O}_1,\mathcal{O}}(Gen_{L_{int}}(\mathcal{O}_1)) \;,\; i_{\mathcal{O}_2,\mathcal{O}}(Gen_{L_{int}}(\mathcal{O}_2)) \right] \;=\; 0 \;\;\; \in Gen_{L_{int}}(\mathcal{O}) \,. $$

(Dütsch-Fredenhagen 00, section 3, following Brunetti-Fredenhagen 99, section 8, Il’in-Slavnov 78)

Proof

Isotony is immediate from the definition of the algebra homomorphisms in def. 134.

Causal locality of the system of observables follows from the causal additivity of the S-matrix, by the first clause in the second statement of prop. 92.

In the same way as in def. 134, the actual net of algebras of perturbative quantum observables (def. 132) is defined:

Definition

(system of algebras of quantum observables)

Let $S$ be a perturbative S-matrix according to def. 128 and let $L_{int} \in \Omega^{d,0}(E)$ be an interaction Lagrangian density.

For $\mathcal{O} \subset \Sigma$ a causally closed subset of spacetime (def. 8) and for $g_{sw} \in Cutoffs(\mathcal{O})$ a compatible adiabatic switching function (def. 33) write

$$ Obs_{g_{sw} L_{int}}(\mathcal{O}) \coloneqq \left\langle -i \frac{d}{d \epsilon} Z_{g_{sw}L_{int}}(\epsilon J)\vert_{\epsilon = 0} \;\vert\; supp(J) \subset \mathcal{O} \right\rangle \;\subset\; \mathcal{W}[ [ g ] ] [ [ \hbar ] ] $$

for the smallest subalgebra of the Wick algebra which contains the perturbative quantum observables on interacting fields (def. 132) supported in $\mathcal{O}$.

Moreover, let

$$ Obs_{L_{int}}(\mathcal{O}) \subset \underset{g_{sw} \in Cutoffs(\mathcal{O})}{\prod} Obs_{g_{sw}L_{int}}(\mathcal{O}) $$

be the subalgebra of the Cartesian product of all these algebras as $g_{sw}$ ranges, which is generated by the tuples

$$ -i \hbar \frac{d}{d \epsilon} Z_{L_{int}}(\epsilon J)\vert_{\epsilon = 0} \;\coloneqq\; \left( - i \hbar \frac{d}{d \epsilon} Z_{g_{sw}L_{int}} (\epsilon J)\vert_{\epsilon = 0} \right)_{g_{sw} \in Cutoffs(\mathcal{O})} $$

for $J$ with $supp(J) \subset \mathcal{O}$.

Finally, for $\mathcal{O}_1 \subset \mathcal{O}_2$ an inclusion of two causally closed subsets, let

$$ i_{\mathcal{O}_1, \mathcal{O}_2} \;\colon\; Obs_{L_{int}}(\mathcal{O}_1) \longrightarrow Obs_{L_{int}}(\mathcal{O}_2) $$

be the algebra homomorphism which is given simply by restricting the index set of tuples.

This construction defines a functor

$$ Obs_{L_{int}} \;\colon\; CausClsdSubsets(\Sigma) \longrightarrow StarAlgebras $$

from the poset of causally closed subsets in the spacetime $\Sigma$ to the category of star algebras.

As a corollary of prop. 93 we then have the key result:

Proposition

(system of algebras of perturbative quantum observables is local net of observables)

Given a perturbative S-matrix according to def. 128 and an interaction Lagrangian density $L_{int}$, the system of algebras of observables $Obs_{L_{int}}$ (def. 135) is a local net of observables in that

  1. (isotony) For every inclusion $\mathcal{O}_1 \subset \mathcal{O}_2$ of causally closed subsets the corresponding algebra homomorphism is a monomorphism

    $$ i_{\mathcal{O}_1, \mathcal{O}_2} \;\colon\; Obs_{L_{int}}(\mathcal{O}_1) \hookrightarrow Obs_{L_{int}}(\mathcal{O}_2) $$

  2. (causal locality) For $\mathcal{O}_1, \mathcal{O}_2 \subset X$ two causally closed subsets which are spacelike separated, in that their causal ordering (def. 32) satisfies

    $$ \mathcal{O}_1 \geq \mathcal{O}_2 \;\text{and}\; \mathcal{O}_2 \geq \mathcal{O}_1 \,, $$

    and for $\mathcal{O} \subset X$ any further causally closed subset which contains both

    $$ \mathcal{O}_1 , \mathcal{O}_2 \subset \mathcal{O} \,, $$

    the corresponding images of the algebras of observables of $\mathcal{O}_1$ and $\mathcal{O}_2$, respectively, commute with each other as subalgebras of the algebra of observables of $\mathcal{O}$:

    $$ \left[ i_{\mathcal{O}_1,\mathcal{O}}(Obs_{L_{int}}(\mathcal{O}_1)) \;,\; i_{\mathcal{O}_2,\mathcal{O}}(Obs_{L_{int}}(\mathcal{O}_2)) \right] \;=\; 0 \;\;\; \in Obs_{L_{int}}(\mathcal{O}) \,. $$

(Dütsch-Fredenhagen 00, below (17), following Brunetti-Fredenhagen 99, section 8, Il’in-Slavnov 78)

Proof

The first point is again immediate from the definition (def. 135).

For the second point it is sufficient to check the commutativity relation on generators. For these the statement follows with prop. 93:

$$
\begin{aligned}
  \left[ -i \frac{d}{d \epsilon_1} Z_{g_{sw}L_{int}}(\epsilon_1 J_1)\vert_{\epsilon_1 = 0} \;,\; -i \frac{d}{d \epsilon_2} Z_{g_{sw}L_{int}}(\epsilon_2 J_2)\vert_{\epsilon_2 = 0} \right]
  & = - \frac{d}{d \epsilon_1} \frac{d}{d \epsilon_2} \underset{ = 0}{ \underbrace{ \left[ Z_{g_{sw}L_{int}}(\epsilon_1 J_1) \;,\; Z_{g_{sw}L_{int}}(\epsilon_2 J_2) \right]}} \vert_{ {\epsilon_1 = 0} \atop {\epsilon_2 = 0}}
  \\
  & = 0
\end{aligned}
$$

for $supp(J_1) \geq supp(J_2)$ and $supp(J_2) \geq supp(J_1)$.

\,

Feynman diagrams

So far we considered only the axioms on a consistent perturbative S-matrix / time-ordered product and their formal consequences. Now we discuss the actual construction of time-ordered products, hence of perturbative S-matrices, by the process called renormalization of Feynman diagrams.

We first discuss how the time-ordered product, and hence the perturbative S-matrix above, is uniquely determined away from the locus where interaction points coincide (prop. 95 below). Moreover, we discuss how, on the complement of that locus, the time-ordered product is naturally expressed as a sum of products of distributions of Feynman propagators that are labeled by Feynman diagrams: the Feynman perturbation series (prop. 96 below).

This means that the full time-ordered product is an extension of distributions of these scattering amplitudes to the locus of coinciding vertices. The space of possible such extensions turns out to be finite-dimensional in each order of $g/\hbar, j/\hbar$, parameterizing the choice of point-supported distributions at the interaction points whose scaling degree is bounded by the given Feynman propagators.

Definition

For $k \in \mathbb{N}$, write

$$ \left(\mathcal{F}_{loc}\langle g,j\rangle\right)^{\otimes^k}_{pds} \hookrightarrow \left(\mathcal{F}_{loc}\langle g,j\rangle\right)^{\otimes^k} $$

for the subspace of the $k$-fold tensor product of the space of compactly supported polynomial local densities (def. \ref{CompactlySupportedPolynomialLocalDensities}) on those tuples which have pairwise disjoint spacetime support.

Proposition

(time-ordered product away from the diagonal)

Restricted to $\left(\mathcal{F}_{loc}\langle g,j\rangle\right)^{\otimes^k}_{pds}$ (def. 136) there is a unique time-ordered product (def. 129), given by the star product that is induced by the Feynman propagator $\omega_F$

$$ F \star_{\omega_F} G \;\coloneqq\; prod \circ \exp\left( \hbar \left\langle \omega_F , \frac{\delta}{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) (F \otimes G) $$

in that

$$ T( L_1 \cdots L_k ) = L_1 \star_{\omega_F} L_2 \star_{\omega_F} \cdots \star_{\omega_F} L_k \,. $$

Proof

Since the singular support of the Feynman propagator is on the diagonal, and since the support of elements in $\left(\mathcal{F}_{loc}\langle g,j\rangle\right)^{\otimes^k}_{pds}$ is by definition in the complement of the diagonal, the star product $\star_{\omega_F}$ is well defined. By construction it satisfies the axioms “perturbation” and “normalization” in def. 129. The only non-trivial point to check is that it indeed satisfies “causal factorization”:

Unwinding the definition of the Hadamard state $\omega$ and the Feynman propagator $\omega_F$, we have

$$
\begin{aligned}
  \omega & = \tfrac{i}{2}( \Delta_R - \Delta_A ) + H
  \\
  \omega_F & = \tfrac{i}{2}( \Delta_R + \Delta_A ) + H
\end{aligned}
$$

where the propagators on the right have, in particular, the following properties:

  1. the advanced propagator vanishes when its first argument is not in the causal past of its second argument:

    $$ (supp(F) \geq supp(G)) \;\Rightarrow\; \left( \left\langle \Delta_A , \frac{\delta F}{\delta \phi} \otimes \frac{\delta G}{\delta \phi} \right\rangle = 0 \right) \,. $$

  2. the retarded propagator equals the advanced propagator with arguments switched:

    $$ \left\langle \Delta_R , \frac{\delta F}{\delta \phi} \otimes \frac{\delta G}{\delta \phi} \right\rangle = \left\langle \Delta_A , \frac{\delta G}{\delta \phi} \otimes \frac{\delta F}{\delta \phi} \right\rangle $$

  3. $H$ is symmetric:

    $$ \left\langle H, \frac{\delta F}{\delta \phi} \otimes \frac{\delta G}{\delta \phi} \right\rangle = \left\langle H, \frac{\delta G}{\delta \phi} \otimes \frac{\delta F}{\delta \phi} \right\rangle $$

It follows for causal ordering $supp(F) \geq supp(G)$ (def. 32) that

$$
\begin{aligned}
  F \star_{\omega_F} G
  & = prod \circ \exp\left( \hbar \left\langle \omega_F , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( F \otimes G )
  \\
  & = prod \circ \exp\left( \hbar \left\langle \tfrac{i}{2}( \Delta_R + \Delta_A ) + H , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( F \otimes G )
  \\
  & = prod \circ \exp\left( \hbar \left\langle \tfrac{i}{2}\Delta_R + H , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( F \otimes G )
  \\
  & = prod \circ \exp\left( \hbar \left\langle \tfrac{i}{2}( \Delta_R - \Delta_A ) + H , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( F \otimes G )
  \\
  & = prod \circ \exp\left( \hbar \left\langle \omega , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( F \otimes G )
  \\
  & = F \star_{\omega} G
\end{aligned}
$$

and for $supp(G) \geq supp(F)$ that

$$
\begin{aligned}
  F \star_{\omega_F} G
  & = prod \circ \exp\left( \hbar \left\langle \omega_F , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( F \otimes G )
  \\
  & = prod \circ \exp\left( \hbar \left\langle \tfrac{i}{2}( \Delta_R + \Delta_A ) + H , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( F \otimes G )
  \\
  & = prod \circ \exp\left( \hbar \left\langle \tfrac{i}{2} \Delta_A + H , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( F \otimes G )
  \\
  & = prod \circ \exp\left( \hbar \left\langle \tfrac{i}{2} \Delta_R + H , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( G \otimes F )
  \\
  & = prod \circ \exp\left( \hbar \left\langle \tfrac{i}{2} (\Delta_R - \Delta_A) + H , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) ( G \otimes F )
  \\
  & = G \star_{\omega} F \,.
\end{aligned}
$$

This shows that $\star_{\omega_F}$ is a consistent time-ordered product on the subspace of functionals with disjoint support. It is immediate from the above that it is the unique solution on this subspace.
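The two computations in this proof can be illustrated in a finite toy model. The following is a minimal sketch under loudly stated assumptions, none of which are from the text: spacetime is replaced by two points, with point 1 in the future of point 2; the observables are monomials $F = \phi_1^m$ and $G = \phi_2^n$; and the values `c` and `h` of the propagators are hypothetical numbers. In this setting the exponential series defining the star product terminates and can be evaluated coefficient by coefficient.

```python
import cmath
from math import factorial

def falling(n, e):
    """Falling factorial n (n-1) ... (n-e+1): the factor from differentiating phi^n e times."""
    return factorial(n) // factorial(n - e)

def star(k, m, n, hbar=1.0):
    """Coefficients c_e of phi_a^(m-e) phi_b^(n-e) in  phi_a^m  star_K  phi_b^n.

    Only the component k = K(a, b) of the kernel contributes, since phi_a^m
    depends only on point a and phi_b^n depends only on point b.
    """
    return [(hbar * k) ** e * (falling(m, e) * falling(n, e)) / factorial(e)
            for e in range(min(m, n) + 1)]

def close(xs, ys):
    """Compare two coefficient lists up to floating-point rounding."""
    return len(xs) == len(ys) and all(cmath.isclose(x, y) for x, y in zip(xs, ys))

# Hypothetical propagator values; point 1 lies in the future of point 2.
c, h = 0.7, 0.3
delta_A = {(1, 2): 0.0, (2, 1): c}                           # property 1
delta_R = {(x, y): delta_A[(y, x)] for (x, y) in delta_A}    # property 2
H = {(1, 2): h, (2, 1): h}                                   # property 3

omega = {p: 0.5j * (delta_R[p] - delta_A[p]) + H[p] for p in H}
omega_F = {p: 0.5j * (delta_R[p] + delta_A[p]) + H[p] for p in H}

m, n = 3, 2  # F = phi_1^3 at the later point, G = phi_2^2 at the earlier point
F_wick_G = star(omega[(1, 2)], m, n)     # F star_omega G (later factor on the left)
F_T_G = star(omega_F[(1, 2)], m, n)      # F star_{omega_F} G
G_T_F = star(omega_F[(2, 1)], n, m)      # G star_{omega_F} F
G_wick_F = star(omega[(2, 1)], n, m)     # G star_omega F (wrong operator order)

print(close(F_T_G, F_wick_G), close(G_T_F, F_wick_G), close(G_wick_F, F_wick_G))
```

In this toy model both orderings of the $\star_{\omega_F}$-product agree with the $\star_\omega$-product in which the later factor stands on the left, while the $\star_\omega$-product in the opposite order differs. This is the content of causal factorization in this restricted setting.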

Remark

(time-ordered product is associative)

Prop. 95 implies in particular that the time-ordered product is associative, in that

$$ T( T(V_1 \cdots V_{k_1}) \cdots T(V_{k_{n-1}+1} \cdots V_{k_n} ) ) = T( V_1 \cdots V_{k_1} \cdots V_{k_{n-1}+1} \cdots V_{k_n} ) \,. $$

It follows that the problem of constructing time-ordered products, and hence (by prop. 91) the perturbative S-matrix, consists of finding a compatible extension of the distribution $prod \circ \exp\left( \hbar \left\langle \omega_F , \frac{\delta }{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right)$ to the diagonal.

Moreover, by the nature of the exponential expression, this means in each order to extend products of Feynman propagators labeled by graphs whose vertices correspond to the polynomial factors in FF and GG and whose edges indicate over which variables the Feynman propagators are to be multiplied.

Definition

(scalar field Feynman diagram)

A scalar field Feynman diagram $\Gamma$ is

  1. a natural number $v \in \mathbb{N}$ (number of vertices);

  2. a $v$-tuple of elements $(V_r \in \mathcal{F}_{loc} \langle g,j\rangle)_{r \in \{1, \cdots, v\}}$ (the interaction and external field vertices);

  3. for each $a \lt b \in \{1, \cdots, v\}$ a natural number $e_{a,b} \in \mathbb{N}$ (the “number of edges from the $a$th to the $b$th vertex”).

For a given tuple $(V_j)$ of interaction vertices we write

$$ FDiag_{(V_j)} $$

for the set of scalar field Feynman diagrams with that tuple of vertices.

Proposition

(Feynman perturbation series away from coinciding vertices)

For $v \in \mathbb{N}$ the $v$-fold time-ordered product away from the diagonal, given by prop. 95

$$ T_v \;\colon\; \left(\mathcal{F}_{loc}\langle g,j\rangle\right)_{pds}^{\otimes^{v}} \longrightarrow \mathcal{W}[ [ g/\hbar, j/\hbar] ] $$

is equal to

$$ T_v(V_1 \cdots V_v) \;=\; prod \circ \underset{\Gamma \in FDiag_{(V_j)_{j = 1}^{v}}}{\sum} \underset{ r \lt s \in \{1, \cdots, v\} }{\prod} \tfrac{1}{e_{r,s}!} \left\langle \hbar \omega_F \,,\, \frac{\delta^{e_{r,s}}}{\delta \phi_r^{e_{r,s}}} \frac{\delta^{e_{r,s}}}{ \delta \phi_s^{e_{r,s}} } \right\rangle (V_1 \otimes \cdots \otimes V_v) \,, $$

where the edge numbers $e_{r,s} = e_{r,s}(\Gamma)$ are those of the given Feynman diagram $\Gamma$ (def. above).

(Keller 10, IV.1)

Proof

We proceed by induction over the number of vertices. The statement is trivially true for a single vertex. Assume it is true for $v \geq 1$ vertices. It follows that

$$
\begin{aligned}
  T(V_1 \cdots V_v V_{v+1})
  & = T( T(V_1 \cdots V_v) V_{v+1} )
  \\
  & = prod \circ \exp\left( \left\langle \hbar \omega_F, \frac{\delta}{\delta \phi} \otimes \frac{\delta}{\delta \phi} \right\rangle \right) \left( prod \circ \underset{\Gamma \in FDiag_{(V_j)_{j = 1}^{v}}}{\sum} \underset{ r \lt s \in \{1, \cdots, v\} }{\prod} \frac{1}{e_{r,s}!} \left\langle \hbar \omega_F \,,\, \frac{\delta^{e_{r,s}}}{\delta \phi_r^{e_{r,s}}} \frac{\delta^{e_{r,s}}}{ \delta \phi_s^{e_{r,s}} } \right\rangle (V_1 \otimes \cdots \otimes V_v) \right) \;\otimes\; V_{v+1}
  \\
  & = prod \circ \underset{\Gamma \in FDiag_{(V_j)_{j = 1}^{v}}}{\sum} \underset{ r \lt s \in \{1, \cdots, v\} }{\prod} \tfrac{1}{e_{r,s}!} \left\langle \hbar \omega_F \,,\, \frac{\delta^{e_{r,s}}}{\delta \phi_r^{e_{r,s}}} \frac{\delta^{e_{r,s}}}{ \delta \phi_s^{e_{r,s}} } \right\rangle \left( \underset{e_{1,{v+1}}, \cdots, e_{v,v+1} \in \mathbb{N}}{\sum} \underset{t \in \{1, \cdots, v\}}{\prod} \tfrac{1}{e_{t,v+1} !} \left( \frac{\delta^{e_{1,v+1}} V_1 }{\delta \phi_{1}^{e_{1,v+1}}} \otimes \cdots \otimes \frac{ \delta^{e_{v,v+1}} V_v}{ \delta \phi_{v}^{e_{v,v+1}} } \right) \;\otimes\; \frac{\delta^{e_{1,v+1} + \cdots + e_{v,v+1}} V_{v+1}}{\delta \phi_{v+1}^{e_{1,v+1} + \cdots + e_{v,v+1}}} \right)
  \\
  & = prod \circ \underset{\Gamma \in FDiag_{(V_j)_{j = 1}^{v+1}}}{\sum} \underset{ r \lt s \in \{1, \cdots, v+1\} }{\prod} \tfrac{1}{e_{r,s}!} \left\langle \hbar \omega_F \,,\, \frac{\delta^{e_{r,s}}}{\delta \phi_r^{e_{r,s}}} \frac{\delta^{e_{r,s}}}{ \delta \phi_s^{e_{r,s}} } \right\rangle (V_1 \otimes \cdots \otimes V_{v+1})
\end{aligned}
$$

Here in the first step we use the associativity of the time-ordered product (remark 31), in the second step we use the induction assumption, in the third we pass the outer functional derivatives through the pointwise product using the product rule, and in the fourth step we recognize that this amounts to summing in addition over all possible choices of sets of edges from the first $v$ vertices to the new $(v+1)$st vertex, which yields in total the sum over all diagrams with $v+1$ vertices.
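The combinatorics of this sum over diagrams can be made concrete in a simplified setting. The following is a minimal sketch, assuming (this is not in the text) that each vertex is a monomial $V_r = \phi^{n_r}$ and that only the numerical prefactor of each diagram is tracked, not the distributional product of Feynman propagators. The function name `feynman_diagrams` and the encoding of a diagram as a dictionary of edge numbers are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def feynman_diagrams(degrees):
    """Yield (edges, weight) for monomial vertices V_r = phi^degrees[r].

    `edges` maps each pair (r, s) with r < s to the edge number e_{r,s}.
    `weight` is the prefactor produced by the formula above:
    the product of 1/e_{r,s}! over all pairs, times the factor
    n_r! / (n_r - d_r)! from differentiating phi^{n_r} a total of d_r times,
    where d_r is the number of edges ending on vertex r.
    """
    v = len(degrees)
    pairs = [(r, s) for r in range(v) for s in range(r + 1, v)]
    ranges = [range(min(degrees[r], degrees[s]) + 1) for r, s in pairs]
    for counts in product(*ranges):
        edges = dict(zip(pairs, counts))
        valence = [sum(e for (r, s), e in edges.items() if t in (r, s))
                   for t in range(v)]
        # Diagrams that would differentiate a vertex too often contribute zero.
        if any(valence[t] > degrees[t] for t in range(v)):
            continue
        weight = Fraction(1)
        for e in counts:
            weight /= factorial(e)
        for t in range(v):
            weight *= factorial(degrees[t]) // factorial(degrees[t] - valence[t])
        yield edges, weight

# Two quadratic vertices: the weights reproduce the usual Wick contraction counts.
two_vertices = {edges[(0, 1)]: weight for edges, weight in feynman_diagrams((2, 2))}
# Three quadratic vertices.
three_vertices = list(feynman_diagrams((2, 2, 2)))
triangle_weight = next(w for e, w in three_vertices if set(e.values()) == {1})

print(two_vertices, len(three_vertices), triangle_weight)
```

In this simplified setting the weight of a diagram equals the number of ways to pair up field factors according to its edge numbers, which is how the factors $\tfrac{1}{e_{r,s}!}$ combine with the functional derivatives.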

Remark

(loop order and powers of Planck's constant)

From prop. 96 one deduces that the order in Planck's constant that a (planar) Feynman diagram contributes to the S-matrix is given (up to a possible offset due to external vertices) by the “number of loops” in the diagram.

In the computation of scattering amplitudes for fields/particles via perturbative quantum field theory the scattering matrix (Feynman perturbation series) is a formal power series in (the coupling constant and) Planck's constant $\hbar$ whose contributions may be labeled by Feynman diagrams. Each Feynman diagram $\Gamma$ is a finite labeled graph, and the order in $\hbar$ to which this graph contributes is

$$ \hbar^{ E(\Gamma) - V(\Gamma) } $$

where

  1. $V(\Gamma) \in \mathbb{N}$ is the number of vertices of the graph;

  2. $E(\Gamma) \in \mathbb{N}$ is the number of edges in the graph.

This comes about, according to the above, because

  1. the explicit $\hbar$-dependence of the S-matrix is

    $$ S\left(\tfrac{g}{\hbar} L_{int} \right) = \underset{k \in \mathbb{N}}{\sum} \frac{g^k}{\hbar^k k!} T( \underset{k \, \text{factors}}{\underbrace{L_{int} \cdots L_{int}}} ) $$

  2. the further $\hbar$-dependence of the time-ordered product $T(\cdots)$ is

    $$ T(L_{int} L_{int}) = prod \circ \exp\left( \hbar \int \omega_{F}(x,y) \frac{\delta}{\delta \phi(x)} \otimes \frac{\delta}{\delta \phi(y)} \right) ( L_{int} \otimes L_{int} ) \,, $$

where $\omega_F$ denotes the Feynman propagator and $\phi(x)$ the field observable at point $x$ (where we are notationally suppressing the internal degrees of freedom of the fields for simplicity, writing them as scalar fields, because this is all that affects the counting of the $\hbar$ powers).

The resulting terms of the S-matrix series are thus labeled by

  1. the number of factors of the interaction $L_{int}$; these are the vertices of the corresponding Feynman diagram and hence each contribute with $\hbar^{-1}$;

  2. the number of integrals over the Feynman propagator $\omega_F$, which correspond to the edges of the Feynman diagram, and each contribute with $\hbar^1$.

Now the formula for the Euler characteristic of planar graphs says that the number of regions in the plane that are encircled by edges (the faces, here thought of as the “loops”) is

$$ L(\Gamma) = 1 + E(\Gamma) - V(\Gamma) \,. $$

Hence a planar Feynman diagram $\Gamma$ contributes with

$$ \hbar^{L(\Gamma)-1} \,. $$

So far this is the discussion for internal edges. An actual scattering matrix element is of the form

$$ \langle \psi_{out} \vert S\left(\tfrac{g}{\hbar} L_{int} \right) \vert \psi_{in} \rangle \,, $$

where

$$ \vert \psi_{in}\rangle \propto \tfrac{1}{\sqrt{\hbar^{n_{in}}}} \phi^\dagger(k_1) \cdots \phi^\dagger(k_{n_{in}}) \vert vac \rangle $$

is a state of $n_{in}$ free field quanta and similarly

$$ \vert \psi_{out}\rangle \propto \tfrac{1}{\sqrt{\hbar^{n_{out}}}} \phi^\dagger(k_1) \cdots \phi^\dagger(k_{n_{out}}) \vert vac \rangle $$

is a state of $n_{out}$ field quanta. The normalization of these states, in view of the commutation relation $[\phi(k), \phi^\dagger(q)] \propto \hbar$, yields the given powers of $\hbar$.

This means that an actual scattering amplitude given by a Feynman diagram $\Gamma$ with $E_{ext}(\Gamma)$ external vertices scales as

$$ \hbar^{L(\Gamma) - 1 + E_{ext}(\Gamma)/2 } \,. $$

(For the analogous discussion of the dependence of the actual quantum observables on $\hbar$, as given by Bogoliubov's formula, see there.)
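The counting in this remark is simple enough to tabulate. The following is a minimal sketch; the function names and the example graphs are hypothetical, and the loop formula $L = 1 + E - V$ is applied under the assumption that the graph is connected.

```python
from fractions import Fraction

def loop_number(num_edges, num_vertices):
    """Loop number L = 1 + E - V of a connected graph."""
    return 1 + num_edges - num_vertices

def hbar_exponent(num_edges, num_vertices, num_external=0):
    """Exponent of hbar: (L - 1) + E_ext / 2, which equals E - V + E_ext / 2."""
    return Fraction(loop_number(num_edges, num_vertices) - 1) + Fraction(num_external, 2)

# Hypothetical examples, given as (internal edges E, vertices V).
examples = {
    "two vertices, one edge (tree)": (1, 2),
    "two vertices, two edges (one loop)": (2, 2),
    "two vertices, three edges (two loops)": (3, 2),
}
for name, (E, V) in examples.items():
    print(name, "L =", loop_number(E, V), "hbar exponent =", hbar_exponent(E, V))

# Including the normalization of four external states in the one-loop example.
one_loop_with_external = hbar_exponent(2, 2, num_external=4)
print(one_loop_with_external)
```

Each additional edge between the same vertices raises the loop number, and hence the power of $\hbar$, by one, in line with the counting of $\hbar^{-1}$ per vertex and $\hbar^{1}$ per edge above.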

\,

Renormalization

renormalization

\,

Quantum Electrodynamics

(…)

\,



Revised on December 4, 2017 07:14:15 by Urs Schreiber (178.6.238.237)